Preface: introduction and objectives


The digital communication industry is an enormous and rapidly growing industry, roughly com- 
parable in size to the computer industry. The objective of this text is to study those aspects 
of digital communication systems that are unique to those systems. That is, rather than focus- 
ing on hardware and software for these systems, which is much like hardware and software for 
many other kinds of systems, we focus on the fundamental system aspects of modern digital 
communication. 


Digital communication is a field in which theoretical ideas have had an unusually powerful 
impact on system design and practice. The basis of the theory was developed in 1948 by 
Claude Shannon, and is called information theory. For the first 25 years or so of its existence, 
information theory served as a rich source of academic research problems and as a tantalizing 
suggestion that communication systems could be made more efficient and more reliable by using 
these approaches. Other than small experiments and a few highly specialized military systems, 
the theory had little interaction with practice. By the mid-1970s, however, mainstream systems
using information theoretic ideas began to be widely implemented. The first reason for this was 
the increasing number of engineers who understood both information theory and communication 
system practice. The second reason was that the low cost and increasing processing power 
of digital hardware made it possible to implement the sophisticated algorithms suggested by 
information theory. The third reason was that the increasing complexity of communication 
systems required the architectural principles of information theory. 


The theoretical principles here fall roughly into two categories: the first provide analytical tools
for determining the performance of particular systems, and the second put fundamental limits on 
the performance of any system. Much of the first category can be understood by engineering un- 
dergraduates, while the second category is distinctly graduate in nature. It is not that graduate 
students know so much more than undergraduates, but rather that undergraduate engineering 
students are trained to master enormous amounts of detail and to master the equations that deal 
with that detail. They are not used to the patience and deep thinking required to understand 
abstract performance limits. This patience comes later with thesis research. 


My original purpose was to write an undergraduate text on digital communication, but experi- 
ence teaching this material over a number of years convinced me that I could not write an honest 
exposition of principles, including both what is possible and what is not possible, without losing 
most undergraduates. There are many excellent undergraduate texts on digital communication 
describing a wide variety of systems, and I didn’t see the need for another. Thus this text is 
now aimed at graduate students, but accessible to patient undergraduates. 


The relationship between theory, problem sets, and engineering/design in an academic subject is 
rather complex. The theory deals with relationships and analysis for models of real systems. A 
good theory (and information theory is one of the best) allows for simple analysis of simplified 
models. It also provides structural principles that allow insights from these simple models 
to be applied to more complex and realistic models. Problem sets provide students with an 
opportunity to analyze these highly simplified models, and, with patience, to start to understand 
the general principles. Engineering deals with making the approximations and judgment calls to 
create simple models that focus on the critical elements of a situation, and from there to design 
workable systems. 


The important point here is that engineering (at this level) cannot really be separated from theory.
Engineering is necessary to choose appropriate theoretical models, and theory is necessary
to find the general properties of those models. To oversimplify it, engineering determines what
the reality is and theory determines the consequences and structure of that reality. At a deeper 
level, however, the engineering perception of reality heavily depends on the perceived structure 
(all of us carry oversimplified models around in our heads). Similarly, the structures created by 
theory depend on engineering common sense to focus on important issues. Engineering some- 
times becomes overly concerned with detail, and theory overly concerned with mathematical 
niceties, but we shall try to avoid both these excesses here. 


Each topic in the text is introduced with highly oversimplified toy models. The results about 
these toy models are then related to actual communication systems and this is used to generalize 
the models. We then iterate back and forth between analysis of models and creation of models. 
Understanding the performance limits on classes of models is essential in this process. 


There are many exercises designed to help understand each topic. Some give examples showing 
how an analysis breaks down if the restrictions are violated. Since analysis always treats models 
rather than reality, these examples build insight into how the results about models apply to real 
systems. Other exercises apply the text results to very simple cases and others generalize the 
results to more complex systems. Yet others explore the sense in which theoretical models apply 
to particular practical problems. 


It is important to understand that the purpose of the exercises is not so much to get the ‘answer’ 
as to acquire understanding. Thus students using this text will learn much more if they discuss 
the exercises with others and think about what they have learned after completing the exercise. 
The point is not to manipulate equations (which computers can now do better than students) 
but rather to understand the equations (which computers cannot do).


As pointed out above, the material here is primarily graduate in terms of abstraction and pa- 
tience, but requires only a knowledge of elementary probability, linear systems, and simple 
mathematical abstraction, so it can be understood at the undergraduate level. For both under- 
graduates and graduates, I feel strongly that learning to reason about engineering material is 
more important, both in the workplace and in further education, than learning to pattern match 
and manipulate equations. 


Most undergraduate communication texts aim at familiarity with a large variety of different 
systems that have been implemented historically. This is certainly valuable in the workplace, at 
least for the near term, and provides a rich set of examples that are valuable for further study. 
The digital communication field is so vast, however, that learning from examples is limited, 
and in the long term it is necessary to learn the underlying principles. The examples from 
undergraduate courses provide a useful background for studying these principles, but the ability 
to reason abstractly that comes from elementary pure mathematics courses is equally valuable. 


Most graduate communication texts focus more on the analysis of problems with less focus on 
the modeling, approximation, and insight needed to see how these problems arise. Our objective 
here is to use simple models and approximations as a way to understand the general principles. 
We will use quite a bit of mathematics in the process, but the mathematics will be used to 
establish general results precisely rather than to carry out detailed analyses of special cases.


Chapter 1


Introduction to digital 
communication 


Communication has been one of the deepest needs of the human race throughout recorded 
history. It is essential to forming social unions, to educating the young, and to expressing a 
myriad of emotions and needs. Good communication is central to a civilized society. 


The various communication disciplines in engineering have the purpose of providing technological 
aids to human communication. One could view the smoke signals and drum rolls of primitive 
societies as being technological aids to communication, but communication technology as we 
view it today became important with telegraphy, then telephony, then video, then computer 
communication, and today the amazing mixture of all of these in inexpensive, small portable 
devices. 


Initially these technologies were developed as separate networks and were viewed as having little 
in common. As these networks grew, however, the fact that all parts of a given network had to 
work together, coupled with the fact that different components were developed at different times 
using different design methodologies, caused an increased focus on the underlying principles and 
architectural understanding required for continued system evolution. 


This need for basic principles was probably best understood at American Telephone and Tele- 
graph (AT&T) where Bell Laboratories was created as the research and development arm of 
AT&T. The math center at Bell Labs became the predominant center for communication research
in the world, and held that position until quite recently. The central core of the principles
of communication technology was developed at that center.


Perhaps the greatest contribution from the math center was the creation of Information Theory 
[27] by Claude Shannon in 1948. For perhaps the first 25 years of its existence, Information 
Theory was regarded as a beautiful theory but not as a central guide to the architecture and 
design of communication systems. After that time, however, both the device technology and the 
engineering understanding of the theory were sufficient to enable system development to follow 
information theoretic principles. 


A number of information theoretic ideas and how they affect communication system design will 
be explained carefully in subsequent chapters. One pair of ideas, however, is central to almost 
every topic. The first is to view all communication sources, e.g., speech waveforms, image 
waveforms, and text files, as being representable by binary sequences. The second is to design
communication systems that first convert the source output into a binary sequence and then
convert that binary sequence into a form suitable for transmission over particular physical media 
such as cable, twisted wire pair, optical fiber, or electromagnetic radiation through space. 


Digital communication systems, by definition, are communication systems that use such a digital¹
sequence as an interface between the source and the channel input (and similarly between the
channel output and final destination) (see Figure 1.1).


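[Figure 1.1 diagram: Source -> Source Encoder -> Channel Encoder -> Channel -> Channel Decoder -> Source Decoder -> Destination, with the binary interface lying between the source encoder/decoder and the channel encoder/decoder.]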


Figure 1.1: Placing a binary interface between source and channel. The source en- 
coder converts the source output to a binary sequence and the channel encoder (often 
called a modulator) processes the binary sequence for transmission over the channel. 
The channel decoder (demodulator) recreates the incoming binary sequence (hopefully 
reliably), and the source decoder recreates the source output. 


The idea of converting an analog source output to a binary sequence was quite revolutionary 
in 1948, and the notion that this should be done before channel processing was even more 
revolutionary. By today, with digital cameras, digital video, digital voice, etc., the idea of 
digitizing any kind of source is commonplace even among the most technophobic. The notion 
of a binary interface before channel transmission is almost as commonplace. For example, we 
all refer to the speed of our internet connection in bits per second. 


There are a number of reasons why communication systems now usually contain a binary inter- 
face between source and channel (i.e., why digital communication systems are now standard). 
These will be explained with the necessary qualifications later, but briefly they are as follows: 


• Digital hardware has become so cheap, reliable, and miniaturized, that digital interfaces are
  eminently practical.

• A standardized binary interface between source and channel simplifies implementation and
  understanding, since source coding/decoding can be done independently of the channel,
  and, similarly, channel coding/decoding can be done independently of the source.

• A standardized binary interface between source and channel simplifies networking, which
  now reduces to sending binary sequences through the network.

• One of the most important of Shannon’s information theoretic results is that if a source
  can be transmitted over a channel in any way at all, it can be transmitted using a binary
  interface between source and channel. This is known as the source/channel separation
  theorem.


¹A digital sequence is a sequence made up of elements from a finite alphabet (e.g., the binary digits {0, 1},
the decimal digits {0, 1, ..., 9}, or the letters of the English alphabet). The binary digits are almost universally
used for digital communication and storage, so we only distinguish digital from binary in those few places where
the difference is significant.


In the remainder of this chapter, the problems of source coding and decoding and channel coding 
and decoding are briefly introduced. First, however, the notion of layering in a communication 
system is introduced. One particularly important example of layering was already introduced in 
Figure 1.1, where source coding and decoding are viewed as one layer and channel coding and 
decoding are viewed as another layer. 


1.1 Standardized interfaces and layering 


Large communication systems such as the Public Switched Telephone Network (PSTN) and the 
Internet have incredible complexity, made up of an enormous variety of equipment made by 
different manufacturers at different times following different design principles. Such complex 
networks need to be based on some simple architectural principles in order to be understood, 
managed, and maintained. 


Two such fundamental architectural principles are standardized interfaces and layering. 


A standardized interface allows the user or equipment on one side of the interface to ignore all 
details about the other side of the interface except for certain specified interface characteristics.
For example, the binary interface² above allows the source coding/decoding to be done
independently of the channel coding/decoding. 


The idea of layering in communication systems is to break up communication functions into a 
string of separate layers as illustrated in Figure 1.2. 


Each layer consists of an input module at the input end of a communication system and a ‘peer’
output module at the other end. The input module at layer i processes the information received
from layer i+1 and sends the processed information on to layer i-1. The peer output module at
layer i works in the opposite direction, processing the received information from layer i-1 and
sending it on to layer i+1.


As an example, an input module might receive a voice waveform from the next higher layer and 
convert the waveform into a binary data sequence that is passed on to the next lower layer. The 
output peer module would receive a binary sequence from the next lower layer at the output 
and convert it back to a speech waveform. 


As another example, a modem consists of an input module (a modulator) and an output module 
(a demodulator). The modulator receives a binary sequence from the next higher input layer 
and generates a corresponding modulated waveform for transmission over a channel. The peer 
module is the remote demodulator at the other end of the channel. It receives a more-or- 
less faithful replica of the transmitted waveform and reconstructs a typically faithful replica 
of the binary sequence. Similarly, the local demodulator is the peer to a remote modulator 
(often collocated with the remote demodulator above). Thus a modem is an input module for 


²The use of a binary sequence at the interface is not quite enough to specify it, as will be discussed later.


[Figure 1.2 diagram: a chain of input modules i, i-1, ..., 1 leading to the channel, mirrored by output modules 1, ..., i-1, i leading back from the channel, with an interface between each pair of adjacent layers.]


Figure 1.2: Layers and interfaces: The specification of the interface between layers
i and i-1 should specify how input module i communicates with input module i-1,
how the corresponding output modules communicate, and, most important, the
input/output behavior of the system to the right of the interface. The designer of layer i-1
uses the input/output behavior of the layers to the right of i-1 to produce the required
input/output performance to the right of layer i. Later examples will show how this
multi-layer process can simplify the overall system design.


communication in one direction and an output module for independent communication in the 
opposite direction. Later chapters consider modems in much greater depth, including how noise 
affects the channel waveform and how that affects the reliability of the recovered binary sequence 
at the output. For now, however, it is enough to simply view the modulator as converting a 
binary sequence to a waveform, with the peer demodulator converting the waveform back to the 
binary sequence. 


As another example, the source coding/decoding layer for a waveform source can be split into 3 
layers as shown in Figure 1.3. One of the advantages of this layering is that discrete sources are 
an important topic in their own right (treated in Chapter 2) and correspond to the inner layer 
of Figure 1.3. Quantization is also an important topic in its own right (treated in Chapter 3).
After both of these are understood, waveform sources become quite simple to understand. 


The channel coding/decoding layer can also be split into several layers, but there are a number 
of ways to do this which will be discussed later. For example, binary error-correction cod- 
ing/decoding can be used as an outer layer with modulation and demodulation as an inner 
layer, but it will be seen later that there are a number of advantages in combining these layers 
into what is called coded modulation.³ Even here, however, layering is important, but the layers
are defined differently for different purposes. 


It should be emphasized that layering is much more than simply breaking a system into com- 
ponents. The input and peer output in each layer encapsulate all the lower layers, and all these 
lower layers can be viewed in aggregate as a communication channel. Similarly, the higher layers 
can be viewed in aggregate as a simple source and destination. 
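The encapsulation described above can be illustrated with a small Python sketch. Everything in it is a hypothetical stand-in chosen for illustration (the class names, the 8-bit character code, the 3-fold repetition code, and a "channel" that corrupts one symbol); none of it comes from the text. The point is only that the upper layer sees the lower layer plus the channel as a single, more reliable bit channel.

```python
# Illustrative sketch only: each layer has an input module and a peer
# output module. All names and coding choices here are hypothetical.

class TextLayer:
    """Upper layer: characters <-> bits (8 bits per character)."""

    def input_module(self, text):
        return [int(b) for ch in text.encode("ascii") for b in format(ch, "08b")]

    def output_module(self, bits):
        codes = [int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8)]
        return bytes(codes).decode("ascii")


class RepeatLayer:
    """Lower layer: repeat each bit 3 times, decode by majority vote."""

    def input_module(self, bits):
        return [b for b in bits for _ in range(3)]

    def output_module(self, received):
        return [int(sum(received[i:i + 3]) >= 2)
                for i in range(0, len(received), 3)]


def channel(symbols):
    """Stand-in for the physical channel; flips the first symbol."""
    out = list(symbols)
    if out:
        out[0] ^= 1
    return out


def communicate(message, layers):
    data = message
    # Pass down through the input modules, highest layer first.
    for layer in layers:
        data = layer.input_module(data)
    data = channel(data)
    # Pass back up through the peer output modules, lowest layer first.
    for layer in reversed(layers):
        data = layer.output_module(data)
    return data


result = communicate("hi", [TextLayer(), RepeatLayer()])
print(result)
```

From the point of view of `TextLayer`, the combination of `RepeatLayer` and `channel` behaves like an error-free binary channel in this run, which is the sense in which lower layers can be viewed in aggregate as a channel.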


The above discussion of layering implicitly assumed a point-to-point communication system 
with one source, one channel, and one destination. Network situations can be considerably 
more complex. With broadcasting, an input module at one layer may have multiple peer output 
modules. Similarly, in multiaccess communication a multiplicity of input modules have a single 


³Notation is nonstandard here. A channel coder (including both coding and modulation) is often referred to
(both here and elsewhere) as a modulator.


[Figure 1.3 diagram: input waveform -> sampler -> quantizer -> discrete encoder -> binary interface -> binary channel -> discrete decoder -> table lookup -> analog filter -> output waveform. The sampler produces an analog sequence and the quantizer produces a symbol sequence.]


Figure 1.3: Breaking the source coding/decoding layer into 3 layers for a waveform 
source. The input side of the outermost layer converts the waveform into a sequence 
of samples and the output side converts the recovered samples back to the waveform. The
quantizer then converts each sample into one of a finite set of symbols, and the peer 
module recreates the sample (with some distortion). Finally the inner layer encodes 
the sequence of symbols into binary digits. 


peer output module. It is also possible in network situations for a single module at one level 
to interface with multiple modules at the next lower layer or the next higher layer. The use of 
layering is at least as important for networks as for point-to-point communications systems. The 
physical layer for networks is essentially the channel encoding/decoding layer discussed here, but 
textbooks on networks rarely discuss these physical layer issues in depth. The network control 
issues at other layers are largely separable from the physical layer communication issues stressed 
here. The reader is referred to [1], for example, for a treatment of these control issues. 


The following three sections give a fuller discussion of the components of Figure 1.1, i.e., of the
fundamental two layers (source coding/decoding and channel coding/decoding) of a point-to- 
point digital communication system, and finally of the interface between them. 


1.2 Communication sources 


The source might be discrete, i.e., it might produce a sequence of discrete symbols, such as letters
from the English or Chinese alphabet, binary symbols from a computer file, etc. Alternatively, 
the source might produce an analog waveform, such as a voice signal from a microphone, the 
output of a sensor, a video waveform, etc. Or, it might be a sequence of images such as X-rays, 
photographs, etc. 


Whatever the nature of the source, the output from the source will be modeled as a sample 
function of a random process. It is not obvious why the inputs to communication systems 
should be modeled as random, and in fact this was not appreciated before Shannon developed 
information theory in 1948. 


The study of communication before 1948 (and much of it well after 1948) was based on Fourier 
analysis; basically one studied the effect of passing sine waves through various kinds of systems
and components and viewed the source signal as a superposition of sine waves. Our study of
channels will begin with this kind of analysis (often called Nyquist theory) to develop basic 
results about sampling, intersymbol interference, and bandwidth. 


Shannon’s view, however, was that if the recipient knows that a sine wave of a given frequency 
is to be communicated, why not simply regenerate it at the output rather than send it over 
a long distance? Or, if the recipient knows that a sine wave of unknown frequency is to be 
communicated, why not simply send the frequency rather than the entire waveform? 


The essence of Shannon’s viewpoint is that the set of possible source outputs, rather than any 
particular output, is of primary interest. The reason is that the communication system must be 
designed to communicate whichever one of these possible source outputs actually occurs. The 
objective of the communication system then is to transform each possible source output into a 
transmitted signal in such a way that these possible transmitted signals can be best distinguished 
at the channel output. A probability measure is needed on this set of possible source outputs 
to distinguish the typical from the atypical. This point of view drives the discussion of all 
components of communication systems throughout this text. 


1.2.1 Source coding 


The source encoder in Figure 1.1 has the function of converting the input from its original 
form into a sequence of bits. As discussed before, the major reasons for this almost universal 
conversion to a bit sequence are as follows: inexpensive digital hardware, standardized interfaces, 
layering, and the source/channel separation theorem. 


The simplest source coding techniques apply to discrete sources and simply involve representing
each successive source symbol by a sequence of binary digits. For example, letters from the
27-symbol English alphabet (including a SPACE symbol) may be encoded into 5-bit blocks. Since
there are 32 distinct 5-bit blocks, each letter may be mapped into a distinct 5-bit block with 
a few blocks left over for control or other symbols. Similarly, upper-case letters, lower-case 
letters, and a great many special symbols may be converted into 8-bit blocks (“bytes”) using 
the standard ASCII code. 
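A minimal Python sketch of the 5-bit fixed-length code described above follows. The particular assignment of blocks to symbols (alphabetical order, with SPACE last) is an arbitrary assumption made for illustration; any one-to-one assignment would serve.

```python
import string

# 27-symbol alphabet: 26 upper-case letters plus SPACE.
# The ordering (and hence the block assigned to each symbol) is an assumption.
ALPHABET = string.ascii_uppercase + " "
ENCODE = {sym: format(i, "05b") for i, sym in enumerate(ALPHABET)}
DECODE = {block: sym for sym, block in ENCODE.items()}


def encode(text):
    """Map each symbol to its 5-bit block and concatenate the blocks."""
    return "".join(ENCODE[sym] for sym in text)


def decode(bits):
    """Split the bit string into 5-bit blocks and look each one up."""
    return "".join(DECODE[bits[i:i + 5]] for i in range(0, len(bits), 5))


bits = encode("HELLO WORLD")
unused_blocks = 2 ** 5 - len(ALPHABET)   # blocks left over for other symbols
print(bits)
print(decode(bits))
print(unused_blocks)
```

Because every symbol uses exactly 5 bits, the decoder needs no separators between blocks; it only needs to know where the sequence starts.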


Chapter 2 treats coding for discrete sources and generalizes the above techniques in many ways. 
For example the input symbols might first be segmented into m-tuples, which are then mapped 
into blocks of binary digits. More generally yet, the blocks of binary digits can be generalized 
into variable-length sequences of binary digits. We shall find that any given discrete source, 
characterized by its alphabet and probabilistic description, has a quantity called entropy asso- 
ciated with it. Shannon showed that this source entropy is equal to the minimum number of 
binary digits per source symbol required to map the source output into binary digits in such a 
way that the source symbols may be retrieved from the encoded sequence. 
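As a small numerical illustration, the sketch below computes entropy using the standard formula H = -Σ p log2(p) for a source whose symbols are independent with known probabilities. The text defers the definition to Chapter 2, so the formula, the function name, and the two example distributions are assumptions introduced here rather than material from this section.

```python
import math


def entropy(probabilities):
    """Entropy in bits per symbol: H = -sum(p * log2(p)), skipping p = 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


uniform_27 = [1 / 27] * 27               # 27 equally likely symbols
skewed = [0.5, 0.25, 0.125, 0.125]       # hypothetical 4-symbol source

print(entropy(uniform_27))
print(entropy(skewed))
```

Under these assumptions, 27 equiprobable symbols have entropy log2(27), a little under the 5 bits per symbol used by the fixed-length code, and a skewed distribution has lower entropy than a uniform one over the same alphabet.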


Some discrete sources generate finite segments of symbols, such as email messages, that are 
statistically unrelated to other finite segments that might be generated at other times. Other 
discrete sources, such as the output from a digital sensor, generate a virtually unending sequence 
of symbols with a given statistical characterization. The simpler models of Chapter 2 will 
correspond to the latter type of source, but the discussion of universal source coding in Section 
2.9 is sufficiently general to cover both types of sources, and virtually any other kind of source. 


The most straightforward approach to analog source coding is called analog to digital (A/D)
conversion. The source waveform is first sampled at a sufficiently high rate (called the “Nyquist
rate”). Each sample is then quantized sufficiently finely for adequate reproduction. For example,
in standard voice telephony, the voice waveform is sampled 8000 times per second; each sample
is then quantized into one of 256 levels and represented by an 8-bit byte. This yields a source
coding bit rate of 64 kb/s.
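The arithmetic behind the 64 kb/s figure, together with a toy quantizer, can be sketched in Python as follows. The uniform quantizer and the input range [-1, 1] are simplifying assumptions made for illustration; practical telephone systems use nonuniform quantization, and the function name is hypothetical.

```python
SAMPLE_RATE_HZ = 8000                          # samples per second (from the text)
LEVELS = 256                                   # quantization levels (from the text)
BITS_PER_SAMPLE = (LEVELS - 1).bit_length()    # 8 bits are enough to index 256 levels

bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE    # bits per second


def quantize(sample, levels=LEVELS, lo=-1.0, hi=1.0):
    """Uniform quantizer (an assumption): index of the cell containing sample."""
    clipped = min(max(sample, lo), hi)
    index = int((clipped - lo) / (hi - lo) * levels)
    return min(index, levels - 1)              # the top edge falls in the last cell


print(bit_rate)
print(format(quantize(0.0), "08b"))            # one sample as an 8-bit byte
```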


Beyond the basic objective of conversion to bits, the source encoder often has the further
objective of doing this as efficiently as possible, i.e., transmitting as few bits as possible, subject
to the need to reconstruct the input adequately at the output. In this case source encoding is
often called data compression. For example, modern speech coders can encode telephone-quality 
speech at bit rates of the order of 6-16 kb/s rather than 64 kb/s. 


The problems of sampling and quantization are largely separable. Chapter 3 develops the basic 
principles of quantization. As with discrete source coding, it is possible to quantize each sample 
separately, but it is frequently preferable to segment the samples into n-tuples and then quantize 
the resulting n-tuples. As shown later, it is also often preferable to view the quantizer output 
as a discrete source output and then to use the principles of Chapter 2 to encode the quantized 
symbols. This is another example of layering. 


Sampling is one of the topics in Chapter 4. The purpose of sampling is to convert the analog 
source into a sequence of real-valued numbers, i.e., into a discrete-time, analog-amplitude source.
There are many other ways, beyond sampling, of converting an analog source to a discrete-time 
source. A general approach, which includes sampling as a special case, is to expand the source 
waveform into an orthonormal expansion and use the coefficients of that expansion to represent 
the source output. The theory of orthonormal expansions is a major topic of Chapter 4. It 
forms the basis for the signal space approach to channel encoding/decoding. Thus Chapter 4 
provides us with the basis for dealing with waveforms both for sources and channels. 


1.3 Communication channels


We next discuss the channel and channel coding in a generic digital communication system. 


In general, a channel is viewed as that part of the communication system between source and 
destination that is given and not under the control of the designer. Thus, to a source-code 
designer, the channel might be a digital channel with binary input and output; to a
telephone-line modem designer, it might be a 4 kHz voice channel; to a cable modem designer, it might
be a physical coaxial cable of up to a certain length, with certain bandwidth restrictions.


When the channel is taken to be the physical medium, the amplifiers, antennas, lasers, etc. that 
couple the encoded waveform to the physical medium might be regarded as part of the channel 
or as part of the channel encoder. It is more common to view these coupling devices as part
of the channel, since their design is quite separable from that of the rest of the channel encoder. 
This, of course, is another example of layering. 


Channel encoding and decoding when the channel is the physical medium (either with or with- 
out amplifiers, antennas, lasers, etc.) is usually called (digital) modulation and demodulation 
respectively. The terminology comes from the days of analog communication where modulation 
referred to the process of combining a lowpass signal waveform with a high frequency sinusoid, 
thus placing the signal waveform in a frequency band appropriate for transmission and regu- 
latory requirements. The analog signal waveform could modulate the amplitude, frequency, or 
phase, for example, of the sinusoid, but in any case, the original waveform (in the absence of
noise) could be retrieved by demodulation.


As digital communication has increasingly replaced analog communication, the modula- 
tion/demodulation terminology has remained, but now refers to the entire process of digital 
encoding and decoding. In most such cases, the binary sequence is first converted to a baseband 
waveform and the resulting baseband waveform is converted to bandpass by the same type of 
procedure used for analog modulation. As will be seen, the challenging part of this problem is 
the conversion of binary data to baseband waveforms. Nonetheless, this entire process will be 
referred to as modulation and demodulation, and the conversion of baseband to passband and 
back will be referred to as frequency conversion. 
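The two stages just described can be sketched, very roughly, in discrete time. The Python fragment below is only a toy under stated assumptions: the mapping of bits to the levels +1 and -1, the rectangular pulse (each level simply held for one bit interval), the number of samples per bit, and the carrier frequency are all arbitrary choices made here, and the function names are hypothetical. Good choices for the baseband stage are the subject of later chapters.

```python
import math


def bits_to_baseband(bits, samples_per_bit=8):
    """Map bit 0 -> +1.0 and bit 1 -> -1.0, holding each level for one bit interval."""
    return [1.0 - 2.0 * b for b in bits for _ in range(samples_per_bit)]


def to_passband(baseband, carrier_cycles_per_sample=0.25):
    """Frequency conversion: multiply the baseband samples by a sampled cosine carrier."""
    return [x * math.cos(2 * math.pi * carrier_cycles_per_sample * n)
            for n, x in enumerate(baseband)]


baseband = bits_to_baseband([0, 1, 1, 0])
passband = to_passband(baseband)
print(baseband[:8])
print(passband[:8])
```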


As in the study of any type of system, a channel is usually viewed in terms of its possible inputs, 
its possible outputs, and a description of how the input affects the output. This description is 
usually probabilistic. If a channel were simply a linear time-invariant system (e.g., a filter), then 
it could be completely characterized by its impulse response or frequency response. However, 
the channels here (and channels in practice) always have an extra ingredient — noise. 


Suppose that there were no noise and a single input voltage level could be communicated exactly. 
Then, representing that voltage level by its infinite binary expansion, it would be possible in 
principle to transmit an infinite number of binary digits by transmitting a single real number. 
This is ridiculous in practice, of course, precisely because noise limits the number of bits that 
can be reliably distinguished. Again, it was Shannon, in 1948, who realized that noise provides 
the fundamental limitation to performance in communication systems. 


The most common channel model involves a waveform input X(t), an added noise waveform Z(t), 
and a waveform output Y(t) = X(t)+ Z(t) that is the sum of the input and the noise, as shown 
in Figure 1.4. Each of these waveforms is viewed as a random process. Random processes are 
studied in Chapter 7, but for now they can be viewed intuitively as waveforms selected in some 
probabilistic way. The noise Z(t) is often modeled as white Gaussian noise (also to be studied 
and explained later). The input is usually constrained in power and bandwidth. 


[Figure 1.4 diagram: the input X(t) and the noise Z(t) enter an adder, which produces the output Y(t) = X(t) + Z(t).] 


Figure 1.4: An additive white Gaussian noise (AWGN) channel. 


Observe that for any channel with input X(t) and output Y(t), the noise could be defined to 
be Z(t) = Y(t) - X(t). Thus there must be something more to an additive-noise channel model 
than what is expressed in Figure 1.4. The additional required ingredient for noise to be called 
additive is that its probabilistic characterization does not depend on the input. 


In a somewhat more general model, called a linear Gaussian channel, the input waveform X(t) 
is first filtered in a linear filter with impulse response h(t), and then independent white Gaussian 
noise Z(t) is added, as shown in Figure 1.5, so that the channel output is 


$$Y(t) = X(t) * h(t) + Z(t),$$


where "*" denotes convolution. Note that Y at time t is a function of X over a range of times, 
i.e., 


$$Y(t) = \int_{-\infty}^{\infty} X(t-\tau)\, h(\tau)\, d\tau + Z(t).$$


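[Figure 1.5 diagram: the input X(t) passes through a filter with impulse response h(t); the noise Z(t) is then added to produce the output Y(t).] 


Figure 1.5: Linear Gaussian channel model. 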


The linear Gaussian channel is often a good model for wireline communication and for line-of- 
sight wireless communication. When engineers, journals, or texts fail to describe the channel of 
interest, this model is a good bet. 
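

As a rough discrete-time illustration of the relation Y(t) = X(t) * h(t) + Z(t), the following sketch convolves a sampled input with a sampled impulse response and adds independent Gaussian samples. The sampling itself, the function name `linear_gaussian_channel`, and the noise standard deviation `sigma` are assumptions made for this illustration; a sampled model of this kind only approximates the continuous-time channel described above. 

```python
import random

def linear_gaussian_channel(x, h, sigma, rng=None):
    """Return y[k] = sum_j x[k-j] * h[j] + z[k], with z[k] i.i.d. N(0, sigma^2).

    x, h  : lists of samples (hypothetical sampled versions of X(t) and h(t))
    sigma : standard deviation of each noise sample (an assumption of this sketch)
    """
    rng = rng or random.Random()
    n_out = len(x) + len(h) - 1
    y = []
    for k in range(n_out):
        acc = 0.0
        for j, hj in enumerate(h):
            if 0 <= k - j < len(x):
                acc += x[k - j] * hj       # discrete convolution term
        y.append(acc + rng.gauss(0.0, sigma))  # additive noise, independent of x
    return y
```

Choosing `h = [1.0]` reduces this sketch to the additive-noise model of Figure 1.4, since the filter then passes the input unchanged. 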


The linear Gaussian channel is a rather poor model for non-line-of-sight mobile communication. 
Here, multiple paths usually exist from source to destination. Mobility of the source, destination, 
or reflecting bodies can cause these paths to change in time in a way best modeled as random. 
A better model for mobile communication is to replace the time-invariant filter h(t) in Figure 
1.5 by a randomly-time-varying linear filter, $H(t,\tau)$, that represents the multiple paths as they 
change in time. Here the output is given by $Y(t) = \int_{-\infty}^{\infty} X(t-u)\, H(u,t)\, du + Z(t)$. These 
randomly varying channels will be studied in Chapter 9. 


1.3.1 Channel encoding (modulation) 


The channel encoder box in Figure 1.1 has the function of mapping the binary sequence at 
the source/channel interface into a channel waveform. A particularly simple approach to this 
is called binary pulse amplitude modulation (2-PAM). Let $\{u_1, u_2, \ldots\}$ denote the incoming 
binary sequence, where each $u_n$ is $\pm 1$ (rather than the traditional 0/1). Let p(t) be a given 
elementary waveform such as a rectangular pulse or a $\sin(\omega t)/(\omega t)$ function. Assuming that the binary 
digits enter at R bits per second (bps), the sequence $u_1, u_2, \ldots$ is mapped into the waveform 
$\sum_n u_n\, p(t - n/R)$. 

Even with this trivially simple modulation scheme, there are a number of interesting questions, 
such as how to choose the elementary waveform p(t) so as to satisfy frequency constraints 
and reliably detect the binary digits from the received waveform in the presence of noise and 
intersymbol interference. 
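

To make the 2-PAM mapping concrete, the following sketch evaluates the sum of shifted pulses, with pulse n weighted by the n-th input value and delayed by n/R, at evenly spaced sample times. The rectangular pulse, the function names, and the choice to sample at the midpoint of each sampling interval (which avoids ambiguity at pulse edges) are assumptions of this illustration rather than part of the text. 

```python
def rect_pulse(t, T):
    """Rectangular pulse: 1 on [0, T), 0 elsewhere (one possible choice of p(t))."""
    return 1.0 if 0.0 <= t < T else 0.0

def pam2_samples(u, R, samples_per_symbol, pulse=rect_pulse):
    """Sample the waveform sum_n u[n] * p(t - n/R) for a sequence u of +1/-1 values.

    R is the bit rate, so successive pulses are spaced T = 1/R apart.
    Samples are taken at the midpoints of the sampling intervals.
    """
    T = 1.0 / R
    dt = T / samples_per_symbol
    out = []
    for k in range(len(u) * samples_per_symbol):
        t = (k + 0.5) * dt
        out.append(sum(u_n * pulse(t - n * T, T) for n, u_n in enumerate(u)))
    return out
```

With a rectangular pulse of width 1/R the pulses do not overlap, so each sample simply equals the current input value. A pulse that extends over several symbol intervals would make neighboring terms of the sum overlap, which is the source of the intersymbol interference mentioned above. 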


Chapter 6 develops the principles of modulation and demodulation. The simple 2-PAM scheme 
is generalized in many ways. For example, multi-level modulation first segments the incoming 
bits into m-tuples. There are $M = 2^m$ distinct m-tuples, and in M-PAM, each m-tuple is 
mapped into a different numerical value (such as $\pm 1, \pm 3, \pm 5, \pm 7$ for M = 8). The sequence 
$u_1, u_2, \ldots$ of these values is then mapped into the waveform $\sum_n u_n\, p(t - mn/R)$. Note that the rate 
at which pulses are sent is now m times smaller than before, but there are $2^m$ different values 
to be distinguished at the receiver for each elementary pulse. 
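

The segmentation into m-tuples can be sketched as follows. The text does not specify which m-tuple is assigned to which amplitude, so the natural binary labeling used here (the m-tuple read as an integer from 0 to M-1, then shifted to the odd integers from -(M-1) to M-1) is an assumption, as is the function name `mpam_levels`. 

```python
def mpam_levels(bits, m):
    """Map successive m-tuples of bits to amplitudes in {-(M-1), ..., -1, +1, ..., M-1}, M = 2**m.

    Labeling assumption: the m-tuple is read as a binary integer index in 0..M-1,
    and index i is sent as amplitude 2*i - (M - 1).
    """
    if len(bits) % m != 0:
        raise ValueError("number of bits must be a multiple of m")
    M = 2 ** m
    levels = []
    for i in range(0, len(bits), m):
        index = int("".join(str(b) for b in bits[i:i + m]), 2)
        levels.append(2 * index - (M - 1))
    return levels
```

For m = 3 this produces the eight amplitudes listed above, and nine input bits yield only three pulses, which illustrates the reduction of the pulse rate by the factor m. 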


The modulated waveform can also be a complex baseband waveform (which is then modulated 
up to an appropriate passband as a real waveform). In a scheme called quadrature amplitude 
modulation (QAM), the bit sequence is again segmented into m-tuples, but now there is a 
mapping from binary m-tuples to a set of $M = 2^m$ complex numbers. The sequence $u_1, u_2, \ldots$ 
of outputs from this mapping is then converted to the complex waveform $\sum_n u_n\, p(t - mn/R)$. 


Finally, instead of using a fixed signal pulse p(t) multiplied by a selection from M real or complex 
values, it is possible to choose M different signal pulses, $p_1(t), \ldots, p_M(t)$. This includes frequency 
shift keying, pulse position modulation, phase modulation, and a host of other strategies. 


It is easy to think of many ways to map a sequence of binary digits into a waveform. We shall 
find that there is a simple geometric “signal-space” approach, based on the results of Chapter 
4, for looking at these various combinations in an integrated way. 


Because of the noise on the channel, the received waveform is different from the transmitted 
waveform. A major function of the demodulator is that of detection. The detector attempts 
to choose which possible input sequence is most likely to have given rise to the given received 
waveform. Chapter 7 develops the background in random processes necessary to understand this 
problem, and Chapter 8 uses the geometric signal-space approach to analyze and understand 
the detection problem. 


1.3.2 Error correction 


Frequently the error probability incurred with simple modulation and demodulation techniques 
is too high. One possible solution is to separate the channel encoder into two layers, first an 
error-correcting code, and then a simple modulator. 


As a very simple example, the bit rate into the channel encoder could be reduced by a factor 
of 3, and then each binary input could be repeated 3 times before entering the modulator. If 
at most one of the 3 binary digits coming out of the demodulator were incorrect, it could be 
corrected by majority rule at the decoder, thus reducing the error probability of the system at 
a considerable cost in data rate. 


The scheme above (repetition encoding followed by majority-rule decoding) is a very simple 
example of error-correction coding. Unfortunately, with this scheme, small error probabilities 
are achieved only at the cost of very small transmission rates. 
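

The repetition scheme is simple enough to sketch in a few lines. The function names are hypothetical, and the error-probability formula at the end assumes that the demodulator errs independently on each bit with probability p, which is an assumption of this illustration rather than a statement from the text. 

```python
def repeat3_encode(bits):
    """Repeat each information bit three times."""
    return [b for b in bits for _ in range(3)]

def majority3_decode(received):
    """Decode each block of three received bits by majority rule."""
    if len(received) % 3 != 0:
        raise ValueError("received length must be a multiple of 3")
    return [1 if sum(received[i:i + 3]) >= 2 else 0
            for i in range(0, len(received), 3)]

def decoded_bit_error_prob(p):
    """Probability that 2 or 3 of the 3 copies are wrong, assuming
    independent errors with probability p on each transmitted bit."""
    return 3 * p**2 * (1 - p) + p**3
```

Under the independence assumption, a raw error probability of 0.1 drops to 0.028 after decoding, while the data rate falls by a factor of 3. This is the unfavorable exchange of rate for reliability described above. 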


What Shannon showed was the very unintuitive fact that more sophisticated coding schemes can 
achieve arbitrarily low error probability at any data rate below a value known as the channel 
capacity. The channel capacity is a function of the probabilistic description of the output 
conditional on each possible input. Conversely, it is not possible to achieve low error probability 
at rates above the channel capacity. A brief proof of this channel coding theorem is given in 
Chapter 8, but readers should refer to texts on information theory such as [7] or [4] for detailed 
coverage. 


The channel capacity for a bandlimited additive white Gaussian noise channel is perhaps the 
most famous result in information theory. If the input power is limited to P, the bandwidth 
limited to W, and the noise power per unit bandwidth is $N_0$, then the capacity (in bits per 
second) is 


$$C = W \log_2\left(1 + \frac{P}{N_0 W}\right).$$
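

As a small numerical aid, the capacity of the bandlimited AWGN channel, W log2(1 + P/(N0 W)), can be evaluated directly. The function name and the argument units (watts, hertz, and watts per hertz) are assumptions chosen for this sketch. 

```python
import math

def awgn_capacity(P, W, N0):
    """Capacity in bits per second of a bandlimited AWGN channel.

    P  : input power constraint (assumed here to be in watts)
    W  : bandwidth (hertz)
    N0 : noise power per unit bandwidth (watts per hertz)
    """
    return W * math.log2(1.0 + P / (N0 * W))
```

For instance, when the signal-to-noise ratio P/(N0 W) equals 3, the capacity is 2 bits per second for each hertz of bandwidth. 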


Only in the past few years have channel coding schemes been developed that can closely approach 
this channel capacity. 


Early uses of error-correcting codes were usually part of a two-layer system similar to that 
above, where a digital error-correcting encoder is followed by a modulator. At the receiver, 
the waveform is first demodulated into a noisy version of the encoded sequence, and then this 
noisy version is decoded by the error-correcting decoder. Current practice frequently achieves 
better performance by combining error correction coding and modulation together in coded 
modulation schemes. Whether the error correction and traditional modulation are separate 
layers or combined, the combination is generally referred to as a modulator and a device that 
does this modulation on data in one direction and demodulation in the other direction is referred 
to as a modem. 


The subject of error correction has grown over the last 50 years to the point where complex and 
lengthy textbooks are dedicated to this single topic (see, for example, [15] and [6]). This text 
provides only an introduction to error-correcting codes. 


The final topic of the text is channel encoding and decoding for wireless channels. Considerable 
attention is paid here to modeling physical wireless media. Wireless channels are subject not 
only to additive noise but also random fluctuations in the strength of multiple paths between 
transmitter and receiver. The interaction of these paths causes fading, and we study how this 
affects coding, signal selection, modulation, and detection. Wireless communication is also used 
to discuss issues such as channel measurement, and how these measurements can be used at 
input and output. Finally there is a brief case study of CDMA (code division multiple access), 
which ties together many of the topics in the text. 


1.4 Digital interface 


The interface between the source coding layer and the channel coding layer is a sequence of bits. 
However, this simple characterization does not tell the whole story. The major complicating 
factors are as follows: 


• Unequal rates: The rate at which bits leave the source encoder is often not perfectly matched 
to the rate at which bits enter the channel encoder. 


• Errors: Source decoders are usually designed to decode an exact replica of the encoded 
sequence, but the channel decoder makes occasional errors. 


• Networks: Encoded source outputs are often sent over networks, traveling serially over 
several channels; each channel in the network typically also carries the output from a number 
of different source encoders. 


The first two factors above appear both in point-to-point communication systems and in net- 
works. They are often treated in an ad hoc way in point-to-point systems, whereas they must 
be treated in a standardized way in networks. The third factor, of course, must also be treated 
in a standardized way in networks. 


The usual approach to these problems in networks is to convert the superficially simple binary 
interface above into multiple layers as illustrated in Figure 1.6. 


How the layers in Figure 1.6 operate and work together is a central topic in the study of networks 
and is treated in detail in network texts such as [1]. These topics are not considered in detail 
here, except for the very brief introduction to follow and a few comments as needed later. 


[Figure 1.6 diagram: source input → source encoder → TCP input → IP input → DLC input → channel encoder → channel → channel decoder → DLC output → IP output → TCP output → source decoder → source output.] 


Figure 1.6: The replacement of the binary interface in Figure 1.1 with 3 layers in an 
oversimplified view of the internet: There is a TCP (transmission control protocol) module 
associated with each source/destination pair; this is responsible for end-to-end error 
recovery and for slowing down the source when the network becomes congested. There 
is an IP (internet protocol) module associated with each node in the network; these 
modules work together to route data through the network and to reduce congestion. 
Finally there is a DLC (data link control) module associated with each channel; this 
accomplishes rate matching and error recovery on the channel. In network terminology, 
the channel, with its encoder and decoder, is called the physical layer. 


1.4.1 Network aspects of the digital interface 


The output of the source encoder is usually segmented into packets (and in many cases, such 
as email and data files, is already segmented in this way). Each of the network layers then 
adds some overhead to these packets, adding a header in the case of TCP (transmission control 
protocol) and IP (internet protocol) and adding both a header and trailer in the case of DLC 
(data link control). Thus what enters the channel encoder is a sequence of frames, where each 
frame has the structure illustrated in Figure 1.7. 


[Figure 1.7 diagram, fields from left to right: DLC header | IP header | TCP header | source encoded packet | DLC trailer.] 


Figure 1.7: The structure of a data frame using the layers of Figure 1.6. 


These data frames, interspersed as needed by idle-fill, are strung together and the resulting bit 
stream enters the channel encoder at its synchronous bit rate. The header and trailer supplied 
by the DLC must contain the information needed for the receiving DLC to parse the received 
bit stream into frames and eliminate the idle-fill. 


The DLC also provides protection against decoding errors made by the channel decoder. Typi- 
cally this is done by using a set of 16 or 32 parity checks in the frame trailer. Each parity check 
specifies whether a given subset of bits in the frame contains an even or odd number of 1’s. Thus 
if errors occur in transmission, it is highly likely that at least one of these parity checks will fail 
in the receiving DLC. This type of DLC is used on channels that permit transmission in both 
directions. When an erroneous frame is detected, it is rejected and a frame in the opposite 
direction requests a retransmission of the erroneous frame. Thus the DLC header must contain 
information about frames traveling in both directions. For details about such protocols, see, for 
example, [1]. 
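

The idea of detecting errors with parity checks in the frame trailer can be sketched as follows. The particular choice of subsets (data position i is covered by check j when bit j of the integer i+1 is 1) and all function names are assumptions made purely for illustration; the text does not say which subsets an actual DLC uses. 

```python
def parity_bits(data_bits, num_checks):
    """Return num_checks parity bits; check j covers the data positions i
    for which bit j of (i + 1) is 1 (an illustrative choice of subsets)."""
    checks = [0] * num_checks
    for i, b in enumerate(data_bits):
        for j in range(num_checks):
            if ((i + 1) >> j) & 1:
                checks[j] ^= b          # running parity of subset j
    return checks

def make_frame(data_bits, num_checks=16):
    """Append the parity checks to the data as a trailer."""
    return list(data_bits) + parity_bits(data_bits, num_checks)

def frame_ok(frame, num_checks=16):
    """Recompute the checks at the receiver and compare with the trailer."""
    data, trailer = frame[:-num_checks], frame[-num_checks:]
    return parity_bits(data, num_checks) == trailer
```

A receiving DLC in this sketch would accept a frame only when `frame_ok` returns True and would otherwise request a retransmission. With these subsets any single bit error causes at least one check to fail, but some multiple-error patterns can still pass undetected. 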


An obvious question at this point is why error correction is typically done both at the physical 
layer and at the DLC layer. Also, why is feedback (i.e., error detection and retransmission) used 
at the DLC layer and not at the physical layer? A partial answer is that if the error correction 
is omitted at one of the layers, the error probability is increased. At the same time, combining 
both procedures (with the same overall overhead) and using feedback at the physical layer can 
result in much smaller error probabilities. The two-layer approach is typically used in practice 
because of standardization issues, but in very difficult communication situations, the combined 
approach can be preferable. From a tutorial standpoint, however, it is preferable to acquire a 
good understanding of channel encoding and decoding using transmission in only one direction 
before considering the added complications of feedback. 


When the receiving DLC accepts a frame, it strips off the DLC header and trailer and the 
resulting packet enters the IP layer. In the IP layer, the address in the IP header is inspected 
to determine whether the packet is at its destination or must be forwarded through another 
channel. Thus the IP layer handles routing decisions, and also sometimes the decision to drop 
a packet if the queues at that node are too long. 


When the packet finally reaches its destination, the IP layer strips off the IP header and passes 
the resulting packet with its TCP header to the TCP layer. The TCP module then goes through 
another error recovery phase* much like that in the DLC module and passes the accepted packets, 
without the TCP header, on to the destination decoder. The TCP and IP layers are also jointly 
responsible for congestion control, which ultimately requires the ability to either reduce the rate 
from sources as required or to simply drop sources that cannot be handled (witness dropped 
cell-phone calls). 


In terms of sources and channels, these extra layers simply provide a sharper understanding of 
the digital interface between source and channel. That is, source encoding still maps the source 
output into a sequence of bits, and from the source viewpoint, all these layers can simply be 
viewed as a channel to send that bit sequence reliably to the destination. 


In a similar way, the input to a channel is a sequence of bits at the channel’s synchronous input 
rate. The output is the same sequence, somewhat delayed and with occasional errors. 


Thus both source and channel have digital interfaces, and the fact that these are slightly dif- 
ferent because of the layering is in fact an advantage. The source encoding can focus solely on 
minimizing the output bit rate (perhaps with distortion and delay constraints) but can ignore 
the physical channel or channels to be used in transmission. Similarly the channel encoding can 
ignore the source and focus solely on maximizing the transmission bit rate (perhaps with delay 
and error rate constraints). 


*Even after all these layered attempts to prevent errors, occasional errors are inevitable. Some are caught by 
human intervention, many don’t make any real difference, and a final few have consequences. C’est la vie. The 
purpose of communication engineers and network engineers is not to eliminate all errors, which is not possible, 
but rather to reduce their probability as much as practically possible. 


1.5 Supplementary reading 


An excellent text that treats much of the material here with more detailed coverage but less 
depth is Proakis [21]. Another good general text is Wilson [34]. The classic work that introduced 
the signal space point of view in digital communication is Wozencraft and Jacobs [35]. Good 
undergraduate treatments are provided in [22], [12], and [23]. 


Readers who lack the necessary background in probability should consult [2] or [24]. More 
advanced treatments of probability are given in [8] and [25]. Feller [5] still remains the classic 
text on probability for the serious student. 


Further material on information theory can be found, for example, in [7] and [4]. The original 
work by Shannon [27] is fascinating and surprisingly accessible. 


The field of channel coding and decoding has become an important but specialized part of 
most communication systems. We introduce coding and decoding in Chapter 8, but a separate 
treatment is required to develop the subject in depth. At M.I.T., the text here is used for the 
first of a two-term sequence and the second term uses a polished set of notes by D. Forney 
[6] available on the web. Alternatively, [15] is a good choice among many texts on coding and 
decoding. 


Wireless communication is probably the major research topic in current digital communication 
work. Chapter 9 provides a substantial introduction to this topic, but a number of texts develop 
wireless communication in much greater depth. Tse and Viswanath [32] and Goldsmith [9] are 
recommended and [33] is a good reference for spread spectrum techniques. 


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare 
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 


Chapter 2 


Coding for Discrete Sources 


2.1 Introduction 


A general block diagram of a point-to-point digital communication system was given in Figure 
1.1. The source encoder converts the sequence of symbols from the source to a sequence of 
binary digits, preferably using as few binary digits per symbol as possible. The source decoder 
performs the inverse operation. Initially, in the spirit of source/channel separation, we ignore 
the possibility that errors are made in the channel decoder and assume that the source decoder 
operates on the source encoder output. 


We first distinguish between three important classes of sources: 


• Discrete sources 


The output of a discrete source is a sequence of symbols from a known discrete alphabet $\mathcal{X}$. 
This alphabet could be the alphanumeric characters, the characters on a computer keyboard, 
English letters, Chinese characters, the symbols in sheet music (arranged in some systematic 
fashion), binary digits, etc. 


The discrete alphabets in this chapter are assumed to contain a finite set of symbols.¹ 


It is often convenient to view the sequence of symbols as occurring at some fixed rate in 
time, but there is no need to bring time into the picture (for example, the source sequence 
might reside in a computer file and the encoding can be done off-line). 


This chapter focuses on source coding and decoding for discrete sources. Supplementary 
references for source coding are Chapter 3 of [7] and Chapter 5 of [4]. A more elementary 
partial treatment is in Sections 4.1-4.3 of [22]. 


• Analog waveform sources 


The output of an analog source, in the simplest case, is an analog real waveform, repre- 
senting, for example, a speech waveform. The word analog is used to emphasize that the 
waveform can be arbitrary and is not restricted to taking on amplitudes from some discrete 
set of values. 


¹A set is usually defined to be discrete if it includes either a finite or countably infinite number of members. 
The countably infinite case does not extend the basic theory of source coding in any important way, but it is 
occasionally useful in looking at limiting cases, which will be discussed as they arise. 


It is also useful to consider analog waveform sources with outputs that are complex functions 
of time; both real and complex waveform sources are discussed later. 


More generally, the output of an analog source might be an image (represented as an inten- 
sity function of horizontal/vertical location) or video (represented as an intensity function 
of horizontal/vertical location and time). For simplicity, we restrict our attention to analog 
waveforms, mapping a single real variable, time, into a real or complex-valued intensity. 


• Discrete-time sources with analog values (analog sequence sources) 


These sources are halfway between discrete and analog sources. The source output is a 
sequence of real numbers (or perhaps complex numbers). Encoding such a source is of 
interest in its own right, but is of interest primarily as a subproblem in encoding analog 
sources. That is, analog waveform sources are almost invariably encoded by first either 
sampling the analog waveform or representing it by the coefficients in a series expansion. 
Either way, the result is a sequence of numbers, which is then encoded. 


There are many differences between discrete sources and the latter two types of analog sources. 
The most important is that a discrete source can be, and almost always is, encoded in such a 
way that the source output can be uniquely retrieved from the encoded string of binary digits. 
Such codes are called uniquely decodable.² On the other hand, for analog sources, there is 
usually no way to map the source values to a bit sequence such that the source values are 
uniquely decodable. For example, an infinite number of binary digits is required for the exact 
specification of an arbitrary real number between 0 and 1. Thus, some sort of quantization is 
necessary for these analog values, and this introduces distortion. Source encoding for analog 
sources thus involves a trade-off between the bit rate and the amount of distortion. 


Analog sequence sources are almost invariably encoded by first quantizing each element of the 
sequence (or more generally each successive n-tuple of sequence elements) into one of a finite 
set of symbols. This symbol sequence is a discrete sequence which can then be encoded into a 
binary sequence. 


Figure 2.1 summarizes this layered view of analog and discrete source coding. As illustrated, 
discrete source coding is both an important subject in its own right for encoding text-like sources 
and the inner layer in the encoding of analog sequences and waveforms. 


The remainder of this chapter discusses source coding for discrete sources. The following chapter 
treats source coding for analog sequences and the fourth chapter treats waveform sources. 


2.2 Fixed-length codes for discrete sources 


The simplest approach to encoding a discrete source into binary digits is to create a code C that 
maps each symbol x of the alphabet $\mathcal{X}$ into a distinct codeword C(x), where C(x) is a block of 
binary digits. Each such block is restricted to have the same block length L, which is why the 
code is called a fixed-length code. 


²Uniquely-decodable codes are sometimes called noiseless codes in elementary treatments. Uniquely decodable 
captures both the intuition and the precise meaning far better than noiseless. Unique decodability is defined 
shortly. 


[Figure 2.1 diagram: input waveform → sampler → (analog sequence) → quantizer → (symbol sequence) → discrete encoder → (binary interface) → binary channel → discrete decoder → table lookup → analog filter → output waveform.] 


Figure 2.1: Discrete sources require only the inner layer above, whereas the inner two 
layers are used for analog sequences and all three layers are used for waveform sources. 


For example, if the alphabet $\mathcal{X}$ consists of the 7 symbols {a, b, c, d, e, f, g}, then the following 
fixed-length code of block length L = 3 could be used. 


C(a) = 000 
C(b) = 001 
C(c) = 010 
C(d) = 011 
C(e) = 100 
C(f) = 101 
C(g) = 110. 


The source output, $x_1, x_2, \ldots$, would then be encoded into the encoded output $C(x_1)C(x_2)\cdots$ 
and thus the encoded output contains L bits per source symbol. For the above example the 
source sequence bad... would be encoded into 001000011.... Note that the output bits are 
simply run together (or, more technically, concatenated). 


There are $2^L$ different combinations of values for a block of L bits. Thus, if the number of 
symbols in the source alphabet, $M = |\mathcal{X}|$, satisfies $M \le 2^L$, then a different binary L-tuple 
may be assigned to each symbol. Assuming that the decoder knows where the beginning of the 
encoded sequence is, the decoder can segment the sequence into L-bit blocks and then decode 
each block into the corresponding source symbol. 


In summary, if the source alphabet has size M, then this coding method requires $L = \lceil \log_2 M \rceil$ 
bits to encode each source symbol, where $\lceil w \rceil$ denotes the smallest integer greater than or equal 
to the real number w. Thus $\log_2 M \le L < \log_2 M + 1$. The lower bound, $\log_2 M$, can be 
achieved with equality if and only if M is a power of 2. 
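

The fixed-length scheme can be sketched as follows. The function names are hypothetical, and assigning codewords in the order in which the symbols are listed is an assumption; it happens to reproduce the 7-symbol example with L = 3. 

```python
def make_fixed_length_code(alphabet):
    """Assign each symbol the L-bit binary expansion of its position,
    with L = ceil(log2 M) computed exactly as (M - 1).bit_length()."""
    M = len(alphabet)
    if M < 2:
        raise ValueError("need at least two symbols")
    L = (M - 1).bit_length()
    code = {sym: format(i, "0{}b".format(L)) for i, sym in enumerate(alphabet)}
    return code, L

def encode(symbols, code):
    """Concatenate the codewords with no separators."""
    return "".join(code[s] for s in symbols)

def decode(bits, code, L):
    """Segment the bit string into L-bit blocks and look each one up."""
    if len(bits) % L != 0:
        raise ValueError("bit string is not a whole number of blocks")
    inverse = {word: sym for sym, word in code.items()}
    return [inverse[bits[i:i + L]] for i in range(0, len(bits), L)]
```

Here `encode("bad", code)` with the alphabet "abcdefg" gives the string 001000011 from the example, and `decode` recovers the symbols because the block boundaries are known once the starting point is known. 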


A technique to be used repeatedly is that of first segmenting the sequence of source symbols 
into successive blocks of n source symbols at a time. Given an alphabet $\mathcal{X}$ of M symbols, there 
are $M^n$ possible n-tuples. These $M^n$ n-tuples are regarded as the elements of a super-alphabet. 
Each n-tuple can be encoded rather than encoding the original symbols. Using fixed-length 
source coding on these n-tuples, each source n-tuple can be encoded into $L = \lceil \log_2 M^n \rceil$ bits. 
The rate $\bar{L} = L/n$ of encoded bits per original source symbol is then bounded by 


$$\bar{L} = \frac{\lceil \log_2 M^n \rceil}{n} \ge \frac{n \log_2 M}{n} = \log_2 M;$$

$$\bar{L} = \frac{\lceil \log_2 M^n \rceil}{n} < \frac{n \log_2 M + 1}{n} = \log_2 M + \frac{1}{n}.$$


Thus $\log_2 M \le \bar{L} < \log_2 M + 1/n$, and by letting n become sufficiently large, the average number 
of coded bits per source symbol can be made arbitrarily close to $\log_2 M$, regardless of whether 
M is a power of 2. 
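

A short computation shows how the number of bits per source symbol behaves when n-tuples are encoded with a fixed-length code. The function name `block_rate` is an assumption of this sketch, which uses exact integer arithmetic to obtain the smallest block length that can label all n-tuples. 

```python
import math

def block_rate(M, n):
    """Bits per source symbol when n-tuples from an M-symbol alphabet are
    encoded with a fixed-length code of ceil(log2(M**n)) bits."""
    L = (M ** n - 1).bit_length()   # exact ceil(log2(M**n)) for M**n >= 2
    return L / n

if __name__ == "__main__":
    M = 3
    for n in (1, 2, 3, 5, 10, 100):
        print(n, block_rate(M, n), math.log2(M))
```

For M = 3, encoding single symbols costs 2 bits per symbol, whereas encoding 5-tuples costs 8 bits per block, i.e., 1.6 bits per symbol, already close to log2(3), which is about 1.585. 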


Some remarks: 


• This simple scheme to make $\bar{L}$ arbitrarily close to $\log_2 M$ is of greater theoretical interest 
than practical interest. As shown later, $\log_2 M$ is the minimum possible binary rate for 
uniquely-decodable source coding if the source symbols are independent and equiprobable. 
Thus this scheme asymptotically approaches this minimum. 


• This result begins to hint at why measures of information are logarithmic in the alphabet 
size.³ The logarithm is usually taken to the base 2 in discussions of binary codes. Henceforth 
log n means "$\log_2 n$." 


• This method is nonprobabilistic; it takes no account of whether some symbols occur more 
frequently than others, and it works robustly regardless of the symbol frequencies. But if 
it is known that some symbols occur more frequently than others, then the rate $\bar{L}$ of coded 
bits per source symbol can be reduced by assigning shorter bit sequences to more common 
symbols in a variable-length source code. This will be our next topic. 


2.3 Variable-length codes for discrete sources 


The motivation for using variable-length encoding on discrete sources is the intuition that data 
compression can be achieved by mapping more probable symbols into shorter bit sequences, 
and less likely symbols into longer bit sequences. This intuition was used in the Morse code of 
old-time telegraphy in which letters were mapped into strings of dots and dashes, using shorter 
strings for common letters and longer strings for less common letters. 


A variable-length code C maps each source symbol $a_j$ in a source alphabet $\mathcal{X} = \{a_1, \ldots, a_M\}$ to 
a binary string $C(a_j)$, called a codeword. The number of bits in $C(a_j)$ is called the length $l(a_j)$ of 
$C(a_j)$. For example, a variable-length code for the alphabet $\mathcal{X} = \{a, b, c\}$ and its lengths might 
be given by 


C(a) = 0     l(a) = 1 
C(b) = 10    l(b) = 2 
C(c) = 11    l(c) = 2 


Successive codewords of a variable-length code are assumed to be transmitted as a continuing 
sequence of bits, with no demarcations of codeword boundaries (i.e., no commas or spaces). The 
source decoder, given an original starting point, must determine where the codeword boundaries 
are; this is called parsing. 


³The notion that information can be viewed as a logarithm of a number of possibilities was first suggested by 
Hartley [11] in 1927. 


A potential system issue with variable-length coding is the requirement for buffering. If source 
symbols arrive at a fixed rate and the encoded bit sequence must be transmitted at a fixed bit 
rate, then a buffer must be provided between input and output. This requires some sort of 
recognizable 'fill' to be transmitted when the buffer is empty, and it creates the possibility of lost data 
when the buffer is full. There are many similar system issues, including occasional errors on 
the channel, initial synchronization, terminal synchronization, etc. Many of these issues are 
discussed later, but they are more easily understood after the more fundamental issues are 
discussed. 


2.3.1 Unique decodability 


The major property that is usually required from any variable-length code is that of unique 
decodability. This essentially means that for any sequence of source symbols, that sequence can 
be reconstructed unambiguously from the encoded bit sequence. Here initial synchronization is 
assumed: the source decoder knows which is the first bit in the coded bit sequence. Note that 
without initial synchronization, not even a fixed-length code can be uniquely decoded. 


Clearly, unique decodability requires that $C(a_i) \ne C(a_j)$ for each $i \ne j$. More than that, however, 
it requires that strings⁴ of encoded symbols be distinguishable. The following definition says 
this precisely: 


Definition 2.3.1. A code C for a discrete source is uniquely decodable if, for any string 
of source symbols, say $x_1, x_2, \ldots, x_n$, the concatenation⁵ of the corresponding codewords, 
$C(x_1)C(x_2)\cdots C(x_n)$, differs from the concatenation of the codewords $C(x'_1)C(x'_2)\cdots C(x'_m)$ for 
any other string $x'_1, x'_2, \ldots, x'_m$ of source symbols. 


In other words, C is uniquely decodable if all concatenations of codewords are distinct. 


Remember that there are no commas or spaces between codewords; the source decoder has 
to determine the codeword boundaries from the received sequence of bits. (If commas were 
inserted, the code would be ternary rather than binary.) 


For example, the above code C for the alphabet $\mathcal{X} = \{a, b, c\}$ is soon shown to be uniquely 
decodable. However, the code C' defined by 


C'(a) = 0 
C'(b) = 1 
C'(c) = 01 


is not uniquely decodable, even though the codewords are all different. If the source decoder 
observes 01, it cannot determine whether the source emitted (ab) or (c). 


Note that the property of unique decodability depends only on the set of codewords and not 
on the mapping from symbols to codewords. Thus we can refer interchangeably to uniquely- 
decodable codes and uniquely-decodable codeword sets. 
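
The ambiguity of C’ can be checked mechanically. The following is a minimal sketch, not part of the original notes: the helper name `parsings` is hypothetical, and it assumes every codeword is a nonempty string of 0's and 1's. It lists every way a received bit string can be split into codewords; a uniquely-decodable code never yields more than one parsing.

```python
def parsings(bits, code):
    """Return every way to split `bits` into codewords of `code`.

    `code` maps source symbols to binary codeword strings (assumed nonempty).
    Each parsing is returned as a list of source symbols.
    """
    if bits == "":
        return [[]]  # the empty string has exactly one (empty) parsing
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            # Strip this codeword and parse whatever remains.
            for rest in parsings(bits[len(word):], code):
                results.append([symbol] + rest)
    return results


c_prime = {"a": "0", "b": "1", "c": "01"}
print(parsings("01", c_prime))  # [['a', 'b'], ['c']]
```

The two parsings returned for 01 are exactly the (ab) and (c) alternatives described above. Exhaustive parsing of this kind is only practical for short strings.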


⁴ A string of symbols is an n-tuple of symbols for any finite n. A sequence of symbols is an n-tuple in the limit
n → ∞, although the word sequence is also used when the length might be either finite or infinite.
⁵ The concatenation of two strings, say $u_1 \cdots u_l$ and $v_1 \cdots v_{l'}$, is the combined string $u_1 \cdots u_l v_1 \cdots v_{l'}$.


2.3.2 Prefix-free codes for discrete sources


Decoding the output from a uniquely-decodable code, and even determining whether it is 
uniquely decodable, can be quite complicated. However, there is a simple class of uniquely- 
decodable codes called prefix-free codes. As shown later, these have the following advantages 
over other uniquely-decodable codes:⁶


• If a uniquely-decodable code exists with a certain set of codeword lengths, then a prefix-free
code can easily be constructed with the same set of lengths.


• The decoder can decode each codeword of a prefix-free code immediately on the arrival of
the last bit in that codeword.


• Given a probability distribution on the source symbols, it is easy to construct a prefix-free
code of minimum expected length.


Definition 2.3.2. A prefix of a string $y_1 \cdots y_l$ is any initial substring $y_1 \cdots y_{l'}$, $l' \le l$, of that
string. The prefix is proper if $l' < l$. A code is prefix-free if no codeword is a prefix of any other
codeword.


For example, the code C with codewords 0,10, and 11 is prefix-free, but the code C’ with 
codewords 0, 1, and 01 is not. Every fixed-length code with distinct codewords is prefix-free. 
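
The prefix-free condition is easy to test directly from the definition. The sketch below is illustrative only: the function name `is_prefix_free` is an assumption, codewords are taken to be Python strings of 0's and 1's, and repeated codewords are treated as a violation.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword in the list."""
    words = list(codewords)
    for i, u in enumerate(words):
        for j, v in enumerate(words):
            # u is a prefix of v exactly when v starts with u.
            if i != j and v.startswith(u):
                return False
    return True


print(is_prefix_free(["0", "10", "11"]))        # True
print(is_prefix_free(["0", "1", "01"]))         # False: 0 is a prefix of 01
print(is_prefix_free(["00", "01", "10", "11"])) # True: fixed length, distinct
```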


We will now show that every prefix-free code is uniquely decodable. The proof is constructive, 
and shows how the decoder can uniquely determine the codeword boundaries. 


Given a prefix-free code C, a corresponding binary code tree can be defined which grows from a 
root on the left to leaves on the right representing codewords. Each branch is labelled 0 or 1 
and each node represents the binary string corresponding to the branch labels from the root to 
that node. The tree is extended just enough to include each codeword. That is, each node in 
the tree is either a codeword or proper prefix of a codeword (see Figure 2.2). 


Figure 2.2: The binary code tree for a prefix-free code. 


The prefix-free condition ensures that each codeword corresponds to a leaf node (i.e., a node 
with no adjoining branches going to the right). Each intermediate node (i.e., nodes having one 
or more adjoining branches going to the right) is a prefix of some codeword reached by traveling 
right from the intermediate node. 


⁶ With all the advantages of prefix-free codes, it is difficult to understand why the more general class is even
discussed. This will become clearer much later.


The tree of Figure 2.2 has an intermediate node, 10, with only one right-going branch. This shows
that the codeword for c could be shortened to 10 without destroying the prefix-free property.
This is shown in Figure 2.3.


[Figure: binary code tree with codewords a → 0, b → 11, c → 10.]


Figure 2.3: A code with shorter lengths than that of Figure 2.2.


A prefix-free code will be called full if no new codeword can be added without destroying the 
prefix-free property. As just seen, a prefix-free code is also full if no codeword can be shortened 
without destroying the prefix-free property. Thus the code of Figure 2.2 is not full, but that of 
Figure 2.3 is. 


To see why the prefix-free condition guarantees unique decodability, consider the tree for the 
concatenation of two codewords. This is illustrated in Figure 2.4 for the code of Figure 2.3. 
This new tree has been formed simply by grafting a copy of the original tree onto each of the 
leaves of the original tree. Each concatenation of two codewords thus lies on a different node 
of the tree and also differs from each single codeword. One can imagine grafting further trees 
onto the leaves of Figure 2.4 to obtain a tree representing still more codewords concatenated 
together. Again all concatenations of code words lie on distinct nodes, and thus correspond to 
distinct binary strings. 


[Figure: binary code tree for all pairs of codewords from the code of Figure 2.3.]

aa → 00      ba → 110      ca → 100
ab → 011     bb → 1111     cb → 1011
ac → 010     bc → 1110     cc → 1010


Figure 2.4: Binary code tree for two codewords; upward branches represent 1’s.


An alternative way to see that prefix-free codes are uniquely decodable is to look at the codeword
parsing problem from the viewpoint of the source decoder. Given the encoded binary string for
any string of source symbols, the source decoder can decode the first symbol simply by reading
the string from left to right and following the corresponding path in the code tree until it reaches
a leaf, which must correspond to the first codeword by the prefix-free property. After stripping
off the first codeword, the remaining binary string is again a string of codewords, so the source
decoder can find the second codeword in the same way, and so on ad infinitum.


For example, suppose a source decoder for the code of Figure 2.3 decodes the sequence
1010011⋯. Proceeding through the tree from the left, it finds that 1 is not a codeword,
but that 10 is the codeword for c. Thus c is decoded as the first symbol of the source output,
leaving the string 10011⋯. Then c is decoded as the next symbol, leaving 011⋯, which is
decoded into a and then b, and so forth.
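
The left-to-right parsing just described can be sketched in a few lines. This is an illustration under stated assumptions rather than a reference implementation: the name `decode_prefix_free` is hypothetical, the code is supplied as a symbol-to-codeword dictionary, and a dictionary lookup on the bits read so far stands in for walking the code tree.

```python
def decode_prefix_free(bits, code):
    """Decode `bits` with a prefix-free `code` (symbol -> codeword string).

    Returns (decoded symbols, leftover bits that do not yet form a codeword).
    """
    lookup = {word: symbol for symbol, word in code.items()}
    symbols = []
    current = ""  # path followed from the root of the code tree so far
    for bit in bits:
        current += bit
        if current in lookup:
            # A leaf has been reached; by the prefix-free property this
            # must be the next codeword, so emit it and return to the root.
            symbols.append(lookup[current])
            current = ""
    return symbols, current


figure_2_3 = {"a": "0", "b": "11", "c": "10"}
print(decode_prefix_free("1010011", figure_2_3))  # (['c', 'c', 'a', 'b'], '')
```

Each symbol is emitted as soon as the last bit of its codeword has been read, with no lookahead.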


This proof also shows that prefix-free codes can be decoded with no delay. As soon as the final bit 
of a codeword is received at the decoder, the codeword can be recognized and decoded without 
waiting for additional bits. For this reason, prefix-free codes are sometimes called instantaneous 
codes. 


It has been shown that all prefix-free codes are uniquely decodable. The converse is not true, 
as shown by the following code: 


C(a) = 0
C(b) = 01
C(c) = 011


An encoded sequence for this code can be uniquely parsed by recognizing 0 as the beginning of 
each new code word. A different type of example is given in Exercise 2.6. 


With variable-length codes, if there are errors in data transmission, then the source decoder 
may lose codeword boundary synchronization and may make more than one symbol error. It is 
therefore important to study the synchronization properties of variable-length codes. For exam- 
ple, the prefix-free code {0,10,110,1110,11110} is instantaneously self-synchronizing, because 
every 0 occurs at the end of a codeword. The shorter prefix-free code {0, 10,110, 1110, 1111} is 
probabilistically self-synchronizing; again, any observed 0 occurs at the end of a codeword, but 
since there may be a sequence of 1111 codewords of unlimited length, the length of time before 
resynchronization is a random variable. These questions are not pursued further here. 


2.3.3 The Kraft inequality for prefix-free codes 


The Kraft inequality [17] is a condition determining whether it is possible to construct a prefix-free
code for a given discrete source alphabet $\mathcal{X} = \{a_1, \ldots, a_M\}$ with a given set of codeword
lengths $\{l(a_j); 1 \le j \le M\}$.


Theorem 2.3.1 (Kraft inequality for prefix-free codes). Every prefix-free code for an alphabet
$\mathcal{X} = \{a_1, \ldots, a_M\}$ with codeword lengths $\{l(a_j); 1 \le j \le M\}$ satisfies

$$\sum_{j=1}^{M} 2^{-l(a_j)} \le 1. \qquad (2.1)$$

Conversely, if (2.1) is satisfied, then a prefix-free code with lengths $\{l(a_j); 1 \le j \le M\}$ exists.


Moreover, every full prefix-free code satisfies (2.1) with equality and every non-full prefix-free 
code satisfies it with strict inequality. 


For example, this theorem implies that there exists a full prefix-free code with codeword lengths 
{1, 2,2} (two such examples have already been given), but there exists no prefix-free code with 
codeword lengths {1,1, 2}. 
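
The two length sets in this example can be checked numerically. The following sketch uses exact rational arithmetic so that equality in (2.1) can be tested without rounding; the function name `kraft_sum` is an illustrative choice, not notation from the text.

```python
from fractions import Fraction


def kraft_sum(lengths):
    """Left side of the Kraft inequality (2.1) for a list of codeword lengths."""
    return sum(Fraction(1, 2 ** l) for l in lengths)


print(kraft_sum([1, 2, 2]))  # 1   -> equality, so a full prefix-free code exists
print(kraft_sum([1, 1, 2]))  # 5/4 -> exceeds 1, so no prefix-free code exists
```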
Before proving the theorem, we show how to represent codewords as base 2 expansions (the
base 2 analog of base 10 decimals) in the binary number system. After understanding this
representation, the theorem will be almost obvious. The base 2 expansion $.y_1 y_2 \cdots y_l$ represents
the rational number $\sum_{m=1}^{l} y_m 2^{-m}$. For example, .011 represents 1/4 + 1/8.


Ordinary decimals with l digits are frequently used to indicate an approximation of a real number
to l places of accuracy. Here, in the same way, the base 2 expansion $.y_1 y_2 \cdots y_l$ is viewed as
‘covering’ the interval⁷ $\left[\sum_{m=1}^{l} y_m 2^{-m},\ \sum_{m=1}^{l} y_m 2^{-m} + 2^{-l}\right)$. This interval has size $2^{-l}$ and
includes all numbers whose base 2 expansions start with $.y_1 \ldots y_l$.

In this way, any codeword $C(a_j)$ of length l is represented by a rational number in the interval
[0, 1) and covers an interval of size $2^{-l}$ which includes all strings that contain $C(a_j)$ as a prefix
(see Figure 2.5). The proof of the theorem follows:


[Figure: the unit interval [0, 1) divided among three codewords.]

1 → .1 covers the interval [1/2, 1)
01 → .01 covers the interval [1/4, 1/2)
00 → .00 covers the interval [0, 1/4)


Figure 2.5: Base 2 expansion numbers and intervals representing codewords. The 
codewords represented above are (00, 01, and 1). 
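
The correspondence between codewords and intervals can be computed directly. In this sketch the function name `codeword_interval` is hypothetical and codewords are assumed to be nonempty strings of 0's and 1's; it returns the two endpoints of the half-open interval covered by a codeword.

```python
from fractions import Fraction


def codeword_interval(word):
    """Return (low, high) for the half-open interval [low, high) covered by `word`."""
    size = Fraction(1, 2 ** len(word))        # the interval has size 2^-l
    low = Fraction(int(word, 2), 2 ** len(word))  # value of the base 2 expansion .y1...yl
    return low, low + size


for word in ["00", "01", "1"]:
    low, high = codeword_interval(word)
    print(word, "covers [", low, ",", high, ")")
```

For the three codewords of Figure 2.5 the intervals are disjoint and together fill [0, 1), which matches equality in (2.1).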


Proof: First, assume that C is a prefix-free code with codeword lengths $\{l(a_j), 1 \le j \le M\}$.
For any distinct $a_j$ and $a_i$ in $\mathcal{X}$, it was shown above that the base 2 expansion corresponding to
$C(a_j)$ cannot lie in the interval corresponding to $C(a_i)$ since $C(a_i)$ is not a prefix of $C(a_j)$. Thus
the lower end of the interval corresponding to any codeword $C(a_j)$ cannot lie in the interval
corresponding to any other codeword. Now, if two of these intervals intersect, then the lower
end of one of them must lie in the other, which is impossible. Thus the two intervals must be
disjoint, and thus the intervals associated with the codewords are all disjoint. Since all
these intervals are contained in the interval [0,1) and the size of the interval corresponding to
$C(a_j)$ is $2^{-l(a_j)}$, (2.1) is established.

Next note that if (2.1) is satisfied with strict inequality, then some interval exists in [0,1) that 
does not intersect any codeword interval; thus another codeword can be ‘placed’ in this interval 
and the code is not full. If (2.1) is satisfied with equality, then the intervals fill up [0,1). In this 
case no additional code word can be added and the code is full. 


Finally we show that a prefix-free code can be constructed from any desired set of codeword
lengths $\{l(a_j), 1 \le j \le M\}$ for which (2.1) is satisfied. Put the set of lengths in nondecreasing
order, $l_1 \le l_2 \le \cdots \le l_M$, and let $u_1, \ldots, u_M$ be the real numbers corresponding to the codewords
in the construction to be described. The construction is quite simple: $u_1 = 0$, and for all
$j$, $1 < j \le M$,

$$u_j = \sum_{i=1}^{j-1} 2^{-l_i}. \qquad (2.2)$$

Each term on the right is an integer multiple of $2^{-l_j}$, so $u_j$ is also an integer multiple of $2^{-l_j}$. From
(2.1), $u_j < 1$, so $u_j$ can be represented by a base 2 expansion with $l_j$ places. The corresponding
codeword of length $l_j$ can be added to the code while preserving prefix-freedom, since its interval
$[u_j, u_j + 2^{-l_j})$ begins where the interval of codeword $j-1$ ends and is therefore disjoint from all
earlier intervals (see Figure 2.6).


⁷ Brackets and parentheses, respectively, are used to indicate closed and open boundaries; thus the interval
[a,b) means the set of real numbers u such that a ≤ u < b.


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].


[Figure: the interval [0, 1) partitioned by the numbers $u_i$, with one codeword per subinterval.]

 i    l_i    u_i (base 2)    C(i)
 1    2      0               00
 2    2      0.01            01
 3    2      0.1             10
 4    3      0.11            110
 5    3      0.111           111


Figure 2.6: Construction of codewords for the set of lengths {2,2,2,3,3}. C(i) is formed
from $u_i$ by representing $u_i$ to $l_i$ places.
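
The construction in (2.2) translates almost line for line into code. The following is a minimal sketch assuming positive integer lengths; the function name `kraft_code` is hypothetical, and the codewords are returned in order of nondecreasing length rather than in the order the lengths were supplied.

```python
from fractions import Fraction


def kraft_code(lengths):
    """Build prefix-free codewords for `lengths` using u_j = sum_{i<j} 2^-l_i."""
    ordered = sorted(lengths)  # nondecreasing order, as the construction requires
    if sum(Fraction(1, 2 ** l) for l in ordered) > 1:
        raise ValueError("lengths violate the Kraft inequality (2.1)")
    codewords = []
    u = Fraction(0)  # u_1 = 0
    for l in ordered:
        # u is an integer multiple of 2^-l, so u * 2^l is an integer whose
        # l-bit binary form is the base 2 expansion of u to l places.
        codewords.append(format(int(u * 2 ** l), "0{}b".format(l)))
        u += Fraction(1, 2 ** l)
    return codewords


print(kraft_code([2, 2, 2, 3, 3]))  # ['00', '01', '10', '110', '111']
```

The output reproduces the codewords of Figure 2.6. Exact fractions are used so that the running sum never suffers from floating-point rounding.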


Some final remarks on the Kraft inequality: 


• Just because a code has lengths that satisfy (2.1), it does not follow that the code is prefix-free,
or even uniquely decodable.


• Exercise 2.11 shows that Theorem 2.3.1 also holds for all uniquely-decodable codes; i.e.,
there exists a uniquely-decodable code with codeword lengths $\{l(a_j), 1 \le j \le M\}$ if and
only if (2.1) holds. This will imply that if a uniquely-decodable code exists with a certain
set of codeword lengths, then a prefix-free code exists with the same set of lengths. So why
use any code other than a prefix-free code?


2.4 Probability models for discrete sources 


It was shown above that prefix-free codes exist for any set of codeword lengths satisfying the
Kraft inequality. When is it desirable to use one of these codes? That is, when is the expected
number of coded bits per source symbol less than log M, and why is the expected number of
coded bits per source symbol the primary parameter of importance?


This question cannot be answered without a probabilistic model for the source. For example, 
the M = 4 prefix-free set of codewords {0,10,110,111} has an expected length of 2.25 > 
2 = log M if the source symbols are equiprobable, but if the source symbol probabilities are 
{1/2, 1/4, 1/8, 1/8}, then the expected length is 1.75 < 2. 
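
Both expected lengths quoted above follow from weighting each codeword length by its symbol probability. A short check, with `expected_length` as an illustrative helper name and exact fractions used to avoid rounding:

```python
from fractions import Fraction


def expected_length(probs, lengths):
    """Expected number of coded bits per source symbol."""
    return sum(p * l for p, l in zip(probs, lengths))


lengths = [1, 2, 3, 3]  # lengths of the codewords 0, 10, 110, 111
equiprobable = [Fraction(1, 4)] * 4
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

print(float(expected_length(equiprobable, lengths)))  # 2.25
print(float(expected_length(skewed, lengths)))        # 1.75
```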


The discrete sources that one meets in applications usually have very complex statistics. For 
example, consider trying to compress email messages. In typical English text, some letters such
as e and o occur far more frequently than q, x, and z. Moreover, the letters are not independent;
for example h is often preceded by t, and q is almost always followed by u. Next, some strings 
of letters are words, while others are not; those that are not have probability near 0 (if in 
fact the text is correct English). Over longer intervals, English has grammatical and semantic 
constraints, and over still longer intervals, such as over multiple email messages, there are still 
further constraints. 


It should be clear therefore that trying to find an accurate probabilistic model of a real-world 
discrete source is not going to be a productive use of our time. An alternative approach, which 
has turned out to be very productive, is to start out by trying to understand the encoding of 
“toy” sources with very simple probabilistic models. After studying such toy sources, it will 
be shown how to generalize to source models with more and more general structure, until, 
presto, real sources can be largely understood even without good stochastic models. This is a 
good example of a problem where having the patience to look carefully at simple and perhaps 
unrealistic models pays off handsomely in the end. 


The type of toy source that will now be analyzed in some detail is called a discrete memoryless 
source. 


2.4.1 Discrete memoryless sources 


A discrete memoryless source (DMS) is defined by the following properties: 


• The source output is an unending sequence, $X_1, X_2, X_3, \ldots$, of randomly selected symbols
from a finite set $\mathcal{X} = \{a_1, a_2, \ldots, a_M\}$, called the source alphabet.


• Each source output $X_1, X_2, \ldots$ is selected from $\mathcal{X}$ using the same probability mass function
(pmf) $\{p_X(a_1), \ldots, p_X(a_M)\}$. Assume that $p_X(a_j) > 0$ for all j, $1 \le j \le M$, since there is
no reason to assign a code word to a symbol of zero probability and no reason to model a
discrete source as containing impossible symbols.


• Each source output $X_k$ is statistically independent of the previous outputs $X_1, \ldots, X_{k-1}$.


The randomly chosen symbols coming out of the source are called random symbols. They are
very much like random variables except that they may take on nonnumeric values. Thus, if
X denotes the result of a fair coin toss, then it can be modeled as a random symbol that
takes values in the set {HEADS, TAILS} with equal probability. Note that if X is a nonnumeric
random symbol, then it makes no sense to talk about its expected value. However, the notion
of statistical independence between random symbols is the same as that for random variables,
i.e., the event that $X_i$ is any given element of $\mathcal{X}$ is independent of the events corresponding to
the values of the other random symbols.


The word memoryless in the definition refers to the statistical independence between different
random symbols, i.e., each variable is chosen with no memory of how the previous random
symbols were chosen. In other words, the source symbol sequence is independent and identically
distributed (iid).⁸


In summary, a DMS is a semi-infinite iid sequence of random symbols

$X_1, X_2, X_3, \ldots$

each drawn from the finite set $\mathcal{X}$, each element of which has positive probability.


⁸ Do not confuse this notion of memorylessness with any non-probabilistic notion in system theory.


A sequence of independent tosses of a biased coin is one example of a DMS. The sequence of 
symbols drawn (with replacement) in a Scrabble™ game is another. The reason for studying 
these sources is that they provide the tools for studying more realistic sources. 
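
A finite stretch of DMS output is simple to simulate, which is useful for experimenting with the codes that follow. The sketch below is an illustration only: the function name `dms_samples` and the 0.7/0.3 bias are arbitrary choices, and the standard-library call `random.Random.choices` draws each symbol independently from the same pmf.

```python
import random


def dms_samples(alphabet, pmf, n, seed=None):
    """Draw n iid symbols from `alphabet` according to `pmf`."""
    rng = random.Random(seed)  # seeding makes a run reproducible
    return rng.choices(alphabet, weights=pmf, k=n)


# Ten independent tosses of a biased coin.
print(dms_samples(["HEADS", "TAILS"], [0.7, 0.3], 10, seed=0))
```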


2.5 Minimum $\overline{L}$ for prefix-free codes


The Kraft inequality determines which sets of codeword lengths are possible for prefix-free codes. 
Given a discrete memoryless source (DMS), we want to determine what set of codeword lengths 
can be used to minimize the expected length of a prefix-free code for that DMS. That is, we 
want to minimize the expected length subject to the Kraft inequality. 


Suppose a set of lengths $l(a_1), \ldots, l(a_M)$ (subject to the Kraft inequality) is chosen for encoding
each symbol into a prefix-free codeword. Define L(X) (or more briefly L) as a random variable
representing the codeword length for the randomly selected source symbol. The expected value
of L for the given code is then given by

$$\overline{L} = E[L] = \sum_{j=1}^{M} l(a_j)\, p_X(a_j).$$

We want to find $\overline{L}_{\min}$, which is defined as the minimum value of $\overline{L}$ over all sets of codeword
lengths satisfying the Kraft inequality.


Before finding $\overline{L}_{\min}$, we explain why this quantity is of interest. The number of bits resulting
from using the above code to encode a long block $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ of symbols is $S_n =
L(X_1) + L(X_2) + \cdots + L(X_n)$. This is a sum of n iid random variables (rv’s), and the law of
large numbers, which is discussed in Section 2.7.1, implies that $S_n/n$, the number of bits per
symbol in this long block, is very close to the expected length $\overline{L}$ with probability very close to 1.
In other words, $\overline{L}$ is essentially the rate (in bits per source symbol) at which bits come out of the
source encoder. This motivates the objective of finding $\overline{L}_{\min}$ and later of finding codes that
achieve the minimum.
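
The law-of-large-numbers claim can be observed in a small simulation. This is a sketch under stated assumptions: the pmf {1/2, 1/4, 1/8, 1/8} and the lengths {1, 2, 3, 3} are reused from the example in Section 2.4, the helper name `bits_per_symbol` is hypothetical, and the codeword lengths are sampled directly rather than encoding actual symbols.

```python
import random


def bits_per_symbol(pmf, lengths, n, seed=0):
    """Simulate S_n / n: average codeword length over n iid source symbols."""
    rng = random.Random(seed)
    # Drawing a length with probability p_j is the same as drawing symbol j
    # and looking up its codeword length L(X_k).
    drawn = rng.choices(lengths, weights=pmf, k=n)
    return sum(drawn) / n


pmf = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]
expected = sum(p * l for p, l in zip(pmf, lengths))  # 1.75 bits per symbol

for n in (10, 1000, 100000):
    print(n, bits_per_symbol(pmf, lengths, n), "expected:", expected)
```

As n grows, the simulated average settles near 1.75; the exact values printed depend on the seed.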


Before proceeding further, we simplify our notation. We have been carrying along a completely
arbitrary finite alphabet $\mathcal{X} = \{a_1, \ldots, a_M\}$ of size $M = |\mathcal{X}|$, but this problem (along with
most source coding problems) involves only the probabilities of the M symbols and not their
names. Thus define the source alphabet to be {1, 2, ..., M}, denote the symbol probabilities by
$p_1, \ldots, p_M$, and denote the corresponding codeword lengths by $l_1, \ldots, l_M$. The expected length
of a code is then

$$\overline{L} = \sum_{j=1}^{M} l_j p_j.$$

Mathematically, the problem of finding $\overline{L}_{\min}$ is that of minimizing $\overline{L}$ over all sets of integer
lengths $l_1, \ldots, l_M$ subject to the Kraft inequality:

$$\overline{L}_{\min} = \min_{l_1, \ldots, l_M:\ \sum_j 2^{-l_j} \le 1} \Big\{ \sum_{j=1}^{M} p_j l_j \Big\}. \qquad (2.3)$$


2.5.1 Lagrange multiplier solution for the minimum $\overline{L}$


The minimization in (2.3) is over a function of M variables, $l_1, \ldots, l_M$, subject to constraints
on those variables. Initially, consider a simpler problem where there is no integer constraint
on the $l_j$. This simpler problem is then to minimize $\sum_j p_j l_j$ over all real values of $l_1, \ldots, l_M$
subject to $\sum_j 2^{-l_j} \le 1$. The resulting minimum is called $\overline{L}_{\min}(\text{noninteger})$.

Since the allowed values for the lengths in this minimization include integer lengths, it is clear
that $\overline{L}_{\min}(\text{noninteger}) \le \overline{L}_{\min}$. This noninteger minimization will provide a number of important
insights about the problem, so its usefulness extends beyond just providing a lower bound on
$\overline{L}_{\min}$.

Note first that the minimum of $\sum_j l_j p_j$ subject to $\sum_j 2^{-l_j} \le 1$ must occur when the constraint
is satisfied with equality, for otherwise, one of the $l_j$ could be reduced, thus reducing $\sum_j p_j l_j$
without violating the constraint. Thus the problem is to minimize $\sum_j p_j l_j$ subject to
$\sum_j 2^{-l_j} = 1$.

Problems of this type are often solved by using a Lagrange multiplier. The idea is to replace the
minimization of one function, subject to a constraint on another function, by the minimization
of a linear combination of the two functions, in this case the minimization of

$$\sum_j p_j l_j + \lambda \sum_j 2^{-l_j}. \qquad (2.4)$$

If the method works, the expression can be minimized for each choice of $\lambda$ (called a Lagrange
multiplier); $\lambda$ can then be chosen so that the optimizing choice of $l_1, \ldots, l_M$ satisfies the constraint.
The minimizing value of (2.4) is then $\sum_j p_j l_j + \lambda$. This choice of $l_1, \ldots, l_M$ minimizes the
original constrained optimization, since for any $l'_1, \ldots, l'_M$ that satisfies the constraint $\sum_j 2^{-l'_j} = 1$,
the expression in (2.4) is $\sum_j p_j l'_j + \lambda$, which must be greater than or equal to $\sum_j p_j l_j + \lambda$.


We can attempt⁹ to minimize (2.4) simply by setting the derivative with respect to each $l_j$ equal
to 0. This yields

$$p_j - \lambda (\ln 2)\, 2^{-l_j} = 0; \qquad 1 \le j \le M. \qquad (2.5)$$

Thus $2^{-l_j} = p_j/(\lambda \ln 2)$. Since $\sum_j p_j = 1$, $\lambda$ must be equal to $1/\ln 2$ in order to satisfy the
constraint $\sum_j 2^{-l_j} = 1$. Then $2^{-l_j} = p_j$, or equivalently $l_j = -\log p_j$. It will be shown shortly
that this stationary point actually achieves a minimum. Substituting this solution into (2.3),

$$\overline{L}_{\min}(\text{noninteger}) = -\sum_{j=1}^{M} p_j \log p_j. \qquad (2.6)$$
j=l 


The quantity on the right side of (2.6) is called the entropy¹⁰ of X, and denoted as H[X]. Thus

$$H[X] = -\sum_j p_j \log p_j.$$


⁹ There are well-known rules for when the Lagrange multiplier method works and when it can be solved simply
by finding a stationary point. The present problem is so simple, however, that this machinery is unnecessary.

¹⁰ Note that X is a random symbol and carries with it all of the accompanying baggage, including a pmf.
The entropy H[X] is a numerical function of the random symbol including that pmf; in the same way E[L] is a
numerical function of the rv L. Both H[X] and E[L] are expected values of particular rv’s. In distinction, L(X)
above is an rv in its own right; it is based on some function l(x) mapping $\mathcal{X} \to \mathbb{R}$ and takes the sample value l(x)
for all sample points such that X = x.


In summary, the entropy H[X] is a lower bound to the expected length $\overline{L}$ for prefix-free codes and this
lower bound is achieved when $l_j = -\log p_j$ for each j. The bound was derived by ignoring the integer
constraint, and can be met only if $-\log p_j$ is an integer for each j; i.e., if each $p_j$ is a power of 2.
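
The entropy is straightforward to evaluate numerically. The sketch below takes logarithms to base 2, so the result is in bits per symbol; the function name `entropy` is an illustrative choice, and all probabilities are assumed to be strictly positive, as in the definition of a DMS.

```python
import math


def entropy(pmf):
    """H[X] = -sum_j p_j log2 p_j, in bits per symbol (all p_j assumed > 0)."""
    return -sum(p * math.log2(p) for p in pmf)


print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75
print(entropy([0.25] * 4))                 # 2.0
```

For the pmf {1/2, 1/4, 1/8, 1/8}, every probability is a power of 2, so the lengths $-\log p_j$ = 1, 2, 3, 3 are integers and the code {0, 10, 110, 111} of Section 2.4 meets the bound with an expected length of 1.75.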


2.5.2 Entropy bounds on $\overline{L}$


We now return to the problem of minimizing the expected length $\overline{L}$ with an integer constraint on
lengths. The following theorem both establishes the correctness of the previous non-integer
optimization and provides an upper bound on $\overline{L}_{\min}$.


Theorem 2.5.1 (Entropy bounds for prefix-free codes). Let X be a discrete random
symbol with symbol probabilities $p_1, \ldots, p_M$. Let $\overline{L}_{\min}$ be the minimum expected codeword length
over all prefix-free codes for X. Then

$$H[X] \le \overline{L}_{\min} < H[X] + 1 \quad \text{bit/symbol}. \qquad (2.7)$$

Furthermore, $\overline{L}_{\min} = H[X]$ if and only if each probability $p_j$ is an integer power of 2.


Proof: It is first shown that $H[X] \le \overline{L}$ for all prefix-free codes. Let $l_1, \ldots, l_M$ be the codeword
lengths of an arbitrary prefix-free code. Then

$$H[X] - \overline{L} = \sum_{j=1}^{M} p_j \log\frac{1}{p_j} - \sum_{j=1}^{M} p_j l_j = \sum_{j=1}^{M} p_j \log\frac{2^{-l_j}}{p_j}, \qquad (2.8)$$

where $\log 2^{-l_j}$ has been substituted for $-l_j$.


We now use the very useful inequality $\ln u \le u - 1$, or equivalently $\log u \le (\log e)(u - 1)$, which
is illustrated in Figure 2.7. Note that equality holds only at the point u = 1.

[Figure: the curve ln u lies below the line u − 1, touching it only at u = 1.]


Figure 2.7: The inequality $\ln u \le u - 1$. The inequality is strict except at u = 1.


Substituting this inequality in (2.8),

$$H[X] - \overline{L} \le (\log e)\sum_{j=1}^{M} p_j\left(\frac{2^{-l_j}}{p_j} - 1\right) = (\log e)\left(\sum_{j=1}^{M} 2^{-l_j} - \sum_{j=1}^{M} p_j\right) \le 0, \qquad (2.9)$$

where the Kraft inequality and $\sum_j p_j = 1$ have been used. This establishes the left side of (2.7).
The inequality in (2.9) is strict unless $2^{-l_j}/p_j = 1$, or equivalently $l_j = -\log p_j$, for all j. For
integer $l_j$, this can be satisfied with equality if and only if $p_j$ is an integer power of 2 for all j. For
arbitrary real values of $l_j$, this proves that (2.5) minimizes (2.3) without the integer constraint,
thus verifying (2.6).


To complete the proof, it will be shown that a prefix-free code exists with $\overline{L} < H[X] + 1$. Choose
the codeword lengths to be

$$l_j = \lceil -\log p_j \rceil,$$

where the ceiling notation $\lceil u \rceil$ denotes the smallest integer greater than or equal to u. With this
choice,

$$-\log p_j \le l_j < -\log p_j + 1. \qquad (2.10)$$


Since the left side of (2.10) is equivalent to $2^{-l_j} \le p_j$, the Kraft inequality is satisfied:

$$\sum_j 2^{-l_j} \le \sum_j p_j = 1.$$

Thus a prefix-free code exists with the above lengths. From the right side of (2.10), the expected
codeword length of this code is upperbounded by

$$\overline{L} = \sum_j p_j l_j < \sum_j p_j(-\log p_j + 1) = H[X] + 1.$$

Since $\overline{L}_{\min} \le \overline{L}$, $\overline{L}_{\min} < H[X] + 1$, completing the proof.
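
The length assignment used in this part of the proof can be tried on a concrete pmf. The sketch below is illustrative: the function name `ceiling_lengths` is hypothetical, and the probabilities {0.4, 0.2, 0.15, 0.15, 0.1} are borrowed from the Huffman example later in this section. It checks the Kraft inequality and both sides of (2.7) for the resulting lengths.

```python
import math


def ceiling_lengths(pmf):
    """Codeword lengths l_j = ceil(-log2 p_j) for a pmf with all p_j > 0."""
    return [math.ceil(-math.log2(p)) for p in pmf]


pmf = [0.4, 0.2, 0.15, 0.15, 0.1]
lengths = ceiling_lengths(pmf)
kraft = sum(2.0 ** -l for l in lengths)           # left side of (2.1)
avg = sum(p * l for p, l in zip(pmf, lengths))    # expected codeword length
h = -sum(p * math.log2(p) for p in pmf)           # entropy H[X]

print(lengths)                        # [2, 3, 3, 3, 4]
print(kraft <= 1, h <= avg < h + 1)   # True True
```

These lengths satisfy the bounds but are not optimal for this pmf, which is why the bound is described as being within one bit of H[X] rather than tight.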


Both the proof above and the noninteger minimization in (2.6) suggest that the optimal length
of a codeword for a source symbol of probability $p_j$ should be approximately $-\log p_j$. This is
not quite true, because, for example, if M = 2 and $p_1 = 2^{-20}$, $p_2 = 1 - 2^{-20}$, then $-\log p_1 = 20$,
but the optimal $l_1$ is 1. However, the last part of the above proof shows that if each $l_j$ is chosen
as an integer approximation to $-\log p_j$, then the expected length $\overline{L}$ is at worst within one bit of H[X].


For sources with a small number of symbols, the upper bound in the theorem appears to be too 
loose to have any value. When these same arguments are applied later to long blocks of source 
symbols, however, the theorem leads directly to the source coding theorem. 


2.5.3. Huffman’s algorithm for optimal source codes 


In the very early days of information theory, a number of heuristic algorithms were suggested
for choosing codeword lengths $l_j$ to approximate $-\log p_j$. Both Claude Shannon and Robert
Fano had suggested such heuristic algorithms by 1948. It was conjectured at that time that,
since this was an integer optimization problem, its optimal solution would be quite difficult.
It was quite a surprise therefore when David Huffman [13] came up with a very simple and
straightforward algorithm for constructing optimal (in the sense of minimal expected length $\overline{L}$)
prefix-free codes. Huffman developed the algorithm in 1950 as a term paper in Robert Fano’s
information theory class at MIT.


Huffman’s trick, in today’s jargon, was to “think outside the box.” He ignored the Kraft inequality,
and looked at the binary code tree to establish properties that an optimal prefix-free code
should have. After discovering a few simple properties, he realized that they led to a simple
recursive procedure for constructing an optimal code.


[Figure: two small code trees.]

With two symbols ($p_1 = 0.6$, $p_2 = 0.4$), the optimal codeword lengths are 1 and 1.

With three symbols ($p_1 = 0.6$, $p_2 = 0.3$, $p_3 = 0.1$), the optimal lengths are 1, 2, 2. The least
likely symbols are assigned words of length 2.


Figure 2.8: Some simple optimal codes.


The simple examples in Figure 2.8 illustrate some key properties of optimal codes. After stating 
these properties precisely, the Huffman algorithm will be almost obvious. 


The property of the length assignments in the three-word example above can be generalized as 
follows: the longer the codeword, the less probable the corresponding symbol must be. More 
precisely: 


Lemma 2.5.1. Optimal codes have the property that if $p_i > p_j$, then $l_i \le l_j$.


Proof: Assume to the contrary that a code has $p_i > p_j$ and $l_i > l_j$. The terms involving symbols
i and j in the expected length $\overline{L}$ are $p_i l_i + p_j l_j$. If the two code words are interchanged, thus
interchanging $l_i$ and $l_j$, this sum decreases, i.e.,

$$(p_i l_i + p_j l_j) - (p_i l_j + p_j l_i) = (p_i - p_j)(l_i - l_j) > 0.$$

Thus $\overline{L}$ decreases, so any code with $p_i > p_j$ and $l_i > l_j$ is nonoptimal.


An even simpler property of an optimal code is as follows:

Lemma 2.5.2. Optimal prefix-free codes have the property that the associated code tree is full.

Proof: If the tree is not full, then a codeword length could be reduced (see Figures 2.2 and
2.3).


Define the sibling of a codeword as the binary string that differs from the codeword in only the
final digit. A sibling in a full code tree can be either a codeword or an intermediate node of the
tree.


Lemma 2.5.3. Optimal prefix-free codes have the property that, for each of the longest code- 
words in the code, the sibling of that codeword is another longest codeword. 


Proof: A sibling of a codeword of maximal length cannot be a prefix of a longer codeword. 
Since it cannot be an intermediate node of the tree, it must be a codeword. 


For notational convenience, assume that the $M = |\mathcal{X}|$ symbols in the alphabet are ordered so
that $p_1 \ge p_2 \ge \cdots \ge p_M$.


Lemma 2.5.4. Let X be a random symbol with a pmf satisfying $p_1 \ge p_2 \ge \cdots \ge p_M$. There
is an optimal prefix-free code for X in which the codewords for M − 1 and M are siblings and
have maximal length within the code.


Proof: There are finitely many codes satisfying the Kraft inequality with equality,¹¹ so consider
a particular one that is optimal. If $p_M < p_j$ for each $j < M$, then, from Lemma 2.5.1, $l_M \ge l_j$
for each j and $l_M$ has maximal length. If $p_M = p_j$ for one or more $j < M$, then $l_j$ must be
maximal for at least one such j. Then if $l_M$ is not maximal, C(j) and C(M) can be interchanged
with no loss of optimality, after which $l_M$ is maximal. Now if C(k) is the sibling of C(M) in this
optimal code, then $l_k$ also has maximal length. By the argument above, C(M − 1) can then be
exchanged with C(k) with no loss of optimality.


The Huffman algorithm chooses an optimal code tree by starting with the two least likely 
symbols, specifically M and M — 1, and constraining them to be siblings in the yet unknown 
code tree. It makes no difference which sibling ends in 1 and which in 0. How is the rest of the 
tree to be chosen? 


If the above pair of siblings is removed from the yet unknown tree, the rest of the tree must
contain M − 1 leaves, namely the M − 2 leaves for the original first M − 2 symbols, and the
parent node of the removed siblings. The probability $p'_{M-1}$ associated with this new leaf is taken
as $p_{M-1} + p_M$. This tree of M − 1 leaves is viewed as a code for a reduced random symbol X’
with a reduced set of probabilities given as $p_1, \ldots, p_{M-2}$ for the original first M − 2 symbols
and $p'_{M-1}$ for the new symbol M − 1.


To complete the algorithm, an optimal code is constructed for X’. It will be shown that an
optimal code for X can be generated by constructing an optimal code for X’, and then grafting
siblings onto the leaf corresponding to symbol M − 1. Assuming this fact for the moment, the
problem of constructing an optimal M-ary code has been replaced with constructing an optimal
(M−1)-ary code. This can be further reduced by applying the same procedure to the (M−1)-ary
random symbol, and so forth down to a binary symbol for which the optimal code is obvious.


The following example in Figures 2.9 to 2.11 will make the entire procedure obvious. It starts 
with a random symbol X with probabilities {0.4,0.2,0.15,0.15,0.1} and generates the reduced 
random symbol X’ in Figure 2.9. The subsequent reductions are shown in Figures 2.10 and 2.11. 


    p_j     symbol
    0.4     1
    0.2     2
    0.15    3
    0.15    4   (siblings, parent node of probability 0.25)
    0.1     5   (siblings, parent node of probability 0.25)

The two least likely symbols, 4 and 5, have been combined as siblings. The reduced set of
probabilities then becomes {0.4, 0.2, 0.15, 0.25}.


Figure 2.9: Step 1 of the Huffman algorithm; finding X’ from X


Another example using a different set of probabilities and leading to a different set of codeword 
lengths is given in Figure 2.12: 


^11 Exercise 2.10 proves this for those who enjoy such things.


    p_j     symbol
    0.4     1
    0.2     2   (siblings, parent node of probability 0.35)
    0.15    3   (siblings, parent node of probability 0.35)
    0.15    4   (siblings, parent node of probability 0.25)
    0.1     5   (siblings, parent node of probability 0.25)

The two least likely symbols in the reduced set, with probabilities 0.15 and 0.2, have been
combined as siblings. The reduced set of probabilities then becomes {0.4, 0.35, 0.25}.


Figure 2.10: Finding X” from X’.


    p_j     symbol   codeword
    0.4     1        1
    0.2     2        011
    0.15    3        010
    0.15    4        001
    0.1     5        000


Figure 2.11: The completed Huffman code.


The only thing remaining to show that the Huffman algorithm constructs optimal codes is to 
show that an optimal code for the reduced random symbol X’ yields an optimal code for X. 
Consider Figure 2.13, which shows the code tree for X’ corresponding to X in Figure 2.12. 


Note that Figures 2.12 and 2.13 differ in that $C(4)$ and $C(5)$, each of length 3 in Figure 2.12,
have been replaced by a single codeword of length 2 in Figure 2.13. The probability of that
single symbol is the sum of the two probabilities in Figure 2.12. Thus the expected codeword
length for Figure 2.12 is that for Figure 2.13, increased by $p_4 + p_5$. This accounts for the fact
that $C(4)$ and $C(5)$ have lengths one greater than their parent node.


In general, comparing the expected length $L'$ of any code for $X'$ and the corresponding $L$ of the
code generated by extending $C'(M-1)$ in the code for $X'$ into two siblings for $M-1$ and $M$,
it is seen that


$$L = L' + p_{M-1} + p_M.$$


This relationship holds for all codes for $X$ in which $C(M-1)$ and $C(M)$ are siblings (which
includes at least one optimal code). This proves that $L$ is minimized by minimizing $L'$, and
also shows that $L_{\min} = L'_{\min} + p_{M-1} + p_M$. This completes the proof of the optimality of the
Huffman algorithm.
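

The reduction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the function name huffman_lengths, the use of a heap, and the tie-breaking counter are assumptions made for the example. Each pass removes the two least likely nodes, makes them siblings (so every symbol beneath them gains one bit), and inserts the parent with the summed probability.

```python
import heapq


def huffman_lengths(probs):
    """Return binary Huffman codeword lengths for the pmf `probs` (illustrative sketch)."""
    lengths = [0] * len(probs)
    # Heap entries: (probability, tie-breaker, symbols below this node).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    counter = len(probs)  # unique tie-breaker so the symbol lists are never compared
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)  # least likely node
        p2, _, s2 = heapq.heappop(heap)  # second least likely node
        for s in s1 + s2:
            lengths[s] += 1  # the two nodes become siblings one level deeper
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))  # reduced random symbol
        counter += 1
    return lengths


if __name__ == "__main__":
    pmf = [0.4, 0.2, 0.15, 0.15, 0.1]
    lengths = huffman_lengths(pmf)
    print(lengths, sum(p * l for p, l in zip(pmf, lengths)))
```

Only the lengths are tracked here; assigning 0 and 1 to each pair of siblings, in either order, would turn the lengths into actual codewords.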


It is curious that neither the Huffman algorithm nor its proof of optimality give any indication
of the entropy bounds, $H[X] \le L_{\min} < H[X] + 1$. Similarly, the entropy bounds do not suggest
the Huffman algorithm. One is useful in finding an optimal code; the other provides insightful
performance bounds.


    p_j     symbol   codeword
    0.35    1        11
    0.2     2        01
    0.2     3        00
    0.15    4        101
    0.1     5        100

Intermediate nodes: (0.25) is the parent of symbols 4 and 5, and (0.4) is the parent of symbols 2 and 3.


Figure 2.12: Completed Huffman code for a different set of probabilities.


    p_j     symbol   codeword
    0.35    1        11
    0.2     2        01
    0.2     3        00
    0.25    4        10

Intermediate nodes: (0.6) is the parent of symbols 1 and 4, and (0.4) is the parent of symbols 2 and 3.


Figure 2.13: Completed reduced Huffman code for Figure 2.12.


As an example of the extent to which the optimal lengths approximate $-\log p_j$, the source
probabilities in Figure 2.11 are {0.40, 0.20, 0.15, 0.15, 0.10}, so $-\log p_j$ takes the set of values
{1.32, 2.32, 2.74, 2.74, 3.32} bits; this approximates the lengths {1, 3, 3, 3, 3} of the optimal code
quite well. Similarly, the entropy is H[X] = 2.15 bits/symbol and $L_{\min}$ = 2.2 bits/symbol, quite
close to H[X]. However, it would be difficult to guess these optimal lengths, even in such a
simple case, without the algorithm.


For the example of Figure 2.12, the source probabilities are {0.35, 0.20, 0.20, 0.15, 0.10}, the
values of $-\log p_j$ are {1.51, 2.32, 2.32, 2.74, 3.32}, and the entropy is H[X] = 2.20. This is not
very different from Figure 2.11. However, the Huffman code now has lengths {2, 2, 2, 3, 3} and
average length L = 2.25 bits/symbol. (The code of Figure 2.11 has average length L = 2.30 for
these source probabilities.) It would be hard to predict these perturbations without carrying
out the algorithm.
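

The numbers quoted for the two examples can be rechecked with a short Python sketch. It is an illustration only; the helper names entropy and expected_length are assumptions, and the codeword lengths are simply copied from Figures 2.11 and 2.12.

```python
from math import log2


def entropy(probs):
    """Entropy in bits/symbol, skipping zero-probability symbols."""
    return sum(p * log2(1 / p) for p in probs if p > 0)


def expected_length(probs, lengths):
    """Expected codeword length in bits/symbol."""
    return sum(p * l for p, l in zip(probs, lengths))


examples = [
    ("Figure 2.11", [0.40, 0.20, 0.15, 0.15, 0.10], [1, 3, 3, 3, 3]),
    ("Figure 2.12", [0.35, 0.20, 0.20, 0.15, 0.10], [2, 2, 2, 3, 3]),
]

if __name__ == "__main__":
    for name, probs, lengths in examples:
        ideal = [round(-log2(p), 2) for p in probs]  # the values -log p_j
        print(name, ideal, round(entropy(probs), 2), round(expected_length(probs, lengths), 2))
```

Passing the Figure 2.12 probabilities together with the Figure 2.11 lengths to expected_length gives the comparison made in parentheses above.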


2.6 Entropy and fixed-to-variable-length codes 


Entropy is now studied in more detail, both to better understand the entropy bounds and to 
understand the entropy of n-tuples of successive source letters. 


The entropy H[X] is a fundamental measure of the randomness of a random symbol X. It has
many important properties. The property of greatest interest here is that it is the smallest
expected number L of bits per source symbol required to map the sequence of source symbols
into a bit sequence in a uniquely decodable way. This will soon be demonstrated by generalizing
the variable-length codes of the last few sections to codes in which multiple source symbols are
encoded together. First, however, several other properties of entropy are derived.


Definition: The entropy of a discrete random symbol^12 $X$ with alphabet $\mathcal{X}$ is


$$H[X] = \sum_{x \in \mathcal{X}} p_X(x) \log \frac{1}{p_X(x)} = -\sum_{x \in \mathcal{X}} p_X(x) \log p_X(x). \tag{2.11}$$


Using logarithms to the base 2, the units of H[X] are bits/symbol. If the base of the logarithm
is e, then the units of H[X] are called nats/symbol. Conversion is easy; just remember that
$\log y = (\ln y)/(\ln 2)$ or $\ln y = (\log y)/(\log e)$, both of which follow from $y = e^{\ln y} = 2^{\log y}$ by
taking logarithms. Thus using another base for the logarithm just changes the numerical units
of entropy by a scale factor.


Note that the entropy H[X] of a discrete random symbol X depends on the probabilities of the
different outcomes of X, but not on the names of the outcomes. Thus, for example, the entropy
of a random symbol taking the values GREEN, BLUE, and RED with probabilities 0.2, 0.3, 0.5,
respectively, is the same as the entropy of a random symbol taking on the values SUNDAY,
MONDAY, FRIDAY with the same probabilities 0.2, 0.3, 0.5.


The entropy H[X] is also called the uncertainty of X, meaning that it is a measure of the
randomness of X. Note that entropy is the expected value of the rv $\log(1/p_X(X))$. This
random variable is called the log pmf rv.^13 Thus the entropy is the expected value of the log
pmf rv.


Some properties of entropy: 


• For any discrete random symbol X, $H[X] \ge 0$. This follows because $p_X(x) \le 1$, so
$\log(1/p_X(x)) \ge 0$. The result follows from (2.11).


• H[X] = 0 if and only if X is deterministic. This follows since $p_X(x) \log(1/p_X(x)) = 0$ if and
only if $p_X(x)$ equals 0 or 1.


• The entropy of an equiprobable random symbol X with an alphabet $\mathcal{X}$ of size M is H[X] =
log M. This follows because, if $p_X(x) = 1/M$ for all $x \in \mathcal{X}$, then


$$H[X] = \sum_{x \in \mathcal{X}} \frac{1}{M} \log M = \log M.$$


In this case, the rv $-\log p_X(X)$ has the constant value log M.


• More generally, the entropy H[X] of a random symbol X defined on an alphabet $\mathcal{X}$ of size
M satisfies $H[X] \le \log M$, with equality only in the equiprobable case. To see this, note
that


$$\begin{aligned}
H[X] - \log M &= \sum_{x \in \mathcal{X}} p_X(x) \left[ \log \frac{1}{p_X(x)} - \log M \right] = \sum_{x \in \mathcal{X}} p_X(x) \left[ \log \frac{1}{M p_X(x)} \right] \\
&\le (\log e) \sum_{x \in \mathcal{X}} p_X(x) \left[ \frac{1}{M p_X(x)} - 1 \right] = 0.
\end{aligned}$$


This uses the inequality $\log u \le (\log e)(u-1)$ (after omitting any terms for which $p_X(x) = 0$).
For equality, it is necessary that $p_X(x) = 1/M$ for all $x \in \mathcal{X}$.


^12 If one wishes to consider discrete random symbols with one or more symbols of zero probability, one can still
use this formula by recognizing that $\lim_{p \to 0} p \log(1/p) = 0$ and then defining 0 log 1/0 as 0 in (2.11). Exercise 2.18
illustrates the effect of zero probability symbols in a variable-length prefix code.

^13 This rv is often called self information or surprise, or uncertainty. It bears some resemblance to the ordinary
meaning of these terms, but historically this has caused much more confusion than enlightenment. Log pmf, on
the other hand, emphasizes what is useful here.


In summary, of all random symbols X defined on a given finite alphabet ¥V, the highest entropy 
occurs in the equiprobable case, namely H[X] = log M, and the lowest occurs in the deterministic 
case, namely H[X] = 0. This supports the intuition that the entropy of a random symbol X is 
a measure of its randomness. 
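

A small Python sketch can illustrate these extremes. It is not from the original text; the function name entropy and the sample pmfs are chosen here purely for illustration. Zero-probability terms are skipped, following the convention that 0 log 1/0 is taken as 0.

```python
from math import log2


def entropy(probs):
    """Entropy in bits/symbol; terms with zero probability contribute 0."""
    return sum(p * log2(1 / p) for p in probs if p > 0)


deterministic = [1.0, 0.0, 0.0, 0.0]
skewed = [0.7, 0.1, 0.1, 0.1]
equiprobable = [0.25, 0.25, 0.25, 0.25]

if __name__ == "__main__":
    for pmf in (deterministic, skewed, equiprobable):
        # Every value should lie between 0 and log M, where M = len(pmf).
        print(pmf, entropy(pmf), log2(len(pmf)))
```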


For any pair of discrete random symbols X and Y, XY is another random symbol. The sample
values of XY are the set of all pairs $xy$, $x \in \mathcal{X}$, $y \in \mathcal{Y}$, and the probability of each sample value
$xy$ is $p_{XY}(x,y)$. An important property of entropy is that if X and Y are independent discrete
random symbols, then H[XY] = H[X] + H[Y]. This follows from:


$$\begin{aligned}
H[XY] &= -\sum_{\mathcal{X} \times \mathcal{Y}} p_{XY}(x,y) \log p_{XY}(x,y) \\
&= -\sum_{\mathcal{X} \times \mathcal{Y}} p_X(x) p_Y(y) \left( \log p_X(x) + \log p_Y(y) \right) = H[X] + H[Y].
\end{aligned} \tag{2.12}$$


Extending this to n random symbols, the entropy of a random symbol $X^n$ corresponding to a
block of n iid outputs from a discrete memoryless source is $H[X^n] = nH[X]$; i.e., each symbol
increments the entropy of the block by H[X] bits.
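

The additivity in (2.12) and its extension to blocks can be checked numerically. The following Python sketch is illustrative only; the pmfs px and py and the helper names entropy and block_pmf are assumptions made for the example.

```python
from itertools import product
from math import isclose, log2


def entropy(pmf):
    """Entropy in bits of a pmf given as a dict {outcome: probability}."""
    return sum(p * log2(1 / p) for p in pmf.values() if p > 0)


def block_pmf(pmf, n):
    """Joint pmf of n iid symbols drawn from `pmf`, keyed by n-tuples."""
    joint = {}
    for xs in product(pmf, repeat=n):
        prob = 1.0
        for x in xs:
            prob *= pmf[x]  # independence: the joint pmf is the product
        joint[xs] = prob
    return joint


px = {"a": 0.5, "b": 0.25, "c": 0.25}
py = {0: 0.9, 1: 0.1}
pxy = {(x, y): px[x] * py[y] for x, y in product(px, py)}  # independent pair XY

if __name__ == "__main__":
    print(entropy(pxy), entropy(px) + entropy(py))
    print(entropy(block_pmf(px, 3)), 3 * entropy(px))
```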


2.6.1 Fixed-to-variable-length codes 


Recall that in Section 2.2 the sequence of symbols from the source was segmented into successive 
blocks of n symbols which were then encoded. Each such block was a discrete random symbol 
in its own right, and thus could be encoded as in the single-symbol case. It was seen that by 
making n large, fixed-length codes could be constructed in which the number L of encoded bits 
per source symbol approached log M as closely as desired. 


The same approach is now taken for variable-length coding of discrete memoryless sources. A
block of n source symbols, $X_1, X_2, \ldots, X_n$, has entropy $H[X^n] = nH[X]$. Such a block is a
random symbol in its own right and can be encoded using a variable-length prefix-free code.
This provides a fixed-to-variable-length code, mapping n-tuples of source symbols to variable-length
binary sequences. It will be shown that the expected number L of encoded bits per source
symbol can be made as close to H[X] as desired.


Surprisingly, this result is very simple. Let $E[L(X^n)]$ be the expected length of a variable-length
prefix-free code for $X^n$. Denote the minimum expected length of any prefix-free code for $X^n$
by $E[L(X^n)]_{\min}$. Theorem 2.5.1 then applies. Using (2.7),


$$H[X^n] \le E[L(X^n)]_{\min} < H[X^n] + 1. \tag{2.13}$$


Define $L_{\min,n} = E[L(X^n)]_{\min}/n$; i.e., $L_{\min,n}$ is the minimum number of bits per source symbol over
all prefix-free codes for $X^n$. From (2.13),


$$H[X] \le L_{\min,n} < H[X] + \frac{1}{n}. \tag{2.14}$$


This simple result establishes the following important theorem:


Theorem 2.6.1 (Prefix-free source coding theorem). For any discrete memoryless source
with entropy H[X], and any integer $n \ge 1$, there exists a prefix-free encoding of source n-tuples for
which the expected codeword length per source symbol L is at most H[X] + 1/n. Furthermore, no
prefix-free encoding of fixed-length source blocks of any length n results in an expected codeword
length L less than H[X].


This theorem gives considerable significance to the entropy H[X] of a discrete memoryless source: 
H[X] is the minimum expected number L of bits per source symbol that can be achieved by 
fixed-to-variable-length prefix-free codes. 
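

As a rough numerical illustration of the theorem, the Python sketch below builds an optimal prefix-free (Huffman) code for n-tuples of a binary source and reports the expected bits per source symbol. The source probabilities (0.9, 0.1) and the function names are assumptions chosen for the example. The expected length is accumulated as the sum of the merged probabilities, which is the relation $L = L' + p_{M-1} + p_M$ from the Huffman optimality proof applied at every reduction.

```python
import heapq
from itertools import product
from math import log2, prod


def huffman_expected_length(probs):
    """Expected codeword length of a binary Huffman code for the pmf `probs`."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged  # each reduction adds the probability of the merged pair
        heapq.heappush(heap, merged)
    return total


def per_symbol_rate(pmf, n):
    """Minimum expected bits per source symbol over prefix-free codes for n-tuples."""
    block_probs = [prod(t) for t in product(pmf, repeat=n)]  # iid n-tuples
    return huffman_expected_length(block_probs) / n


if __name__ == "__main__":
    pmf = [0.9, 0.1]
    H = sum(p * log2(1 / p) for p in pmf)
    for n in range(1, 7):
        print(n, round(per_symbol_rate(pmf, n), 4), "bounds:", round(H, 4), round(H + 1 / n, 4))
```

The number of n-tuples grows as $M^n$, so this brute-force enumeration is practical only for small n.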


There are two potential questions about the significance of the theorem. First, is it possible 
to find uniquely-decodable codes other than prefix-free codes for which L is less than H[X]? 
Second, is it possible to further reduce L by using variable-to-variable-length codes? 


For example, if a binary source has $p_1 = 10^{-6}$ and $p_0 = 1 - 10^{-6}$, fixed-to-variable-length
codes must use remarkably long n-tuples of source symbols to approach the entropy bound.
Run-length coding, which is an example of variable-to-variable-length coding, is a more sensible
approach in this case: the source is first encoded into a sequence representing the number of
source 0’s between each 1, and then this sequence of integers is encoded. This coding technique
is further developed in Exercise 2.23.
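

The first stage of run-length coding, mapping the source output into counts of 0’s between successive 1’s, might look like the following Python sketch. The function names and the separate count of trailing 0’s are assumptions made here so that the mapping is invertible on finite strings. The second stage, encoding the resulting integers, is left to Exercise 2.23 and is not attempted here.

```python
def to_run_lengths(bits):
    """Return (runs, trailing): the number of 0's before each 1, and the 0's after the last 1."""
    runs, count = [], 0
    for b in bits:
        if b == 0:
            count += 1
        else:
            runs.append(count)  # a 1 terminates the current run of 0's
            count = 0
    return runs, count


def from_run_lengths(runs, trailing):
    """Invert to_run_lengths."""
    bits = []
    for r in runs:
        bits.extend([0] * r + [1])
    bits.extend([0] * trailing)
    return bits


if __name__ == "__main__":
    source = [0, 0, 0, 1, 0, 1, 1, 0, 0]
    print(to_run_lengths(source))
```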


The next section strengthens Theorem 2.6.1, showing that H[X] is indeed a lower bound to L 
over all uniquely-decodable encoding techniques. 


2.7 The AEP and the source coding theorems 


We first review the weak^14 law of large numbers (WLLN) for sequences of iid rv’s. Applying
the WLLN to a particular iid sequence, we will establish a form of the remarkable asymptotic
equipartition property (AEP).

Crudely, the AEP says that, given a very long string of n iid discrete random symbols
$X_1, \ldots, X_n$, there exists a “typical set” of sample strings $(x_1, \ldots, x_n)$ whose aggregate probability
is almost 1. There are roughly $2^{nH[X]}$ typical strings of length n, and each has a probability
roughly equal to $2^{-nH[X]}$. We will have to be careful about what the words “almost” and
“roughly” mean here.


The AEP will give us a fundamental understanding not only of source coding for discrete memo- 
ryless sources, but also of the probabilistic structure of such sources and the meaning of entropy. 
The AEP will show us why general types of source encoders, such as variable-to-variable-length 
encoders, cannot have a strictly smaller expected length per source symbol than the best fixed- 
to-variable-length prefix-free codes for discrete memoryless sources. 


^14 The word weak is something of a misnomer, since this is one of the most useful results in probability theory.
There is also a strong law of large numbers; the difference lies in the limiting behavior of an infinite sequence of
rv’s, but this difference is not relevant here. The weak law applies in some cases where the strong law does not,
but this also is not relevant here.


2.7.1 The weak law of large numbers


Let $Y_1, Y_2, \ldots$ be a sequence of iid rv’s. Let $\bar{Y}$ and $\sigma_Y^2$ be the mean and variance of each $Y_j$.
Define the sample average $A_Y^n$ of $Y_1, \ldots, Y_n$ as


$$A_Y^n = \frac{S_Y^n}{n} \qquad \text{where} \qquad S_Y^n = Y_1 + \cdots + Y_n.$$


The sample average $A_Y^n$ is itself an rv, whereas, of course, the mean $\bar{Y}$ is simply a real number.
Since the sum $S_Y^n$ has mean $n\bar{Y}$ and variance $n\sigma_Y^2$, the sample average $A_Y^n$ has mean $E[A_Y^n] = \bar{Y}$
and variance $\sigma_{A_Y^n}^2 = \sigma_{S_Y^n}^2/n^2 = \sigma_Y^2/n$. It is important to understand that the variance of the sum
increases with n and the variance of the normalized sum (the sample average, $A_Y^n$) decreases
with n.


The Chebyshev inequality states that if $\sigma_X^2 < \infty$ for an rv X, then $\Pr\{|X - \bar{X}| \ge \epsilon\} \le \sigma_X^2/\epsilon^2$
for any $\epsilon > 0$ (see Exercise 2.3 or any text on probability such as [2] or [24]). Applying this
inequality to $A_Y^n$ yields the simplest form of the WLLN: for any $\epsilon > 0$,


$$\Pr\{|A_Y^n - \bar{Y}| \ge \epsilon\} \le \frac{\sigma_Y^2}{n\epsilon^2}. \tag{2.15}$$


This is illustrated in Figure 2.14. 


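[Figure: distribution functions of the sample averages $A_Y^n$ and $A_Y^{2n}$ plotted against $y$, with
$\bar{Y}-\epsilon$, $\bar{Y}$ and $\bar{Y}+\epsilon$ marked on the axis and the rises $\Pr(|A_Y^n - \bar{Y}| < \epsilon)$ and
$\Pr(|A_Y^{2n} - \bar{Y}| < \epsilon)$ indicated.]

Figure 2.14: Sketch of the distribution function of the sample average for different n.
As n increases, the distribution function approaches a unit step at $\bar{Y}$. The closeness to
a step within $\bar{Y} \pm \epsilon$ is upperbounded by (2.15).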


Since the right side of (2.15) approaches 0 with increasing n for any fixed $\epsilon > 0$,


$$\lim_{n \to \infty} \Pr\{|A_Y^n - \bar{Y}| \ge \epsilon\} = 0. \tag{2.16}$$


For large n, (2.16) says that $A_Y^n - \bar{Y}$ is small with high probability. It does not say that $A_Y^n = \bar{Y}$
with high probability (or even nonzero probability), and it does not say that $\Pr(|A_Y^n - \bar{Y}| \ge
\epsilon) = 0$. As illustrated in Figure 2.14, both a nonzero $\epsilon$ and a nonzero probability are required
here, even though they can be made simultaneously as small as desired by increasing n.
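

To make the statement concrete, the following Python sketch compares the exact value of $\Pr\{|A_Y^n - \bar{Y}| \ge \epsilon\}$ with the Chebyshev bound of (2.15) for iid Bernoulli rv’s. The Bernoulli model, the parameter values, and the function names are assumptions chosen for illustration; the exact probability comes from the binomial pmf rather than from simulation.

```python
from math import comb


def tail_probability(n, p, eps):
    """Exact Pr{|A_n - p| >= eps} for the sample average of n iid Bernoulli(p) rv's."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if abs(k / n - p) >= eps
    )


def chebyshev_bound(n, p, eps):
    """Right side of (2.15) with variance p(1 - p)."""
    return p * (1 - p) / (n * eps**2)


if __name__ == "__main__":
    p, eps = 0.25, 0.1
    for n in (10, 100, 1000):
        print(n, tail_probability(n, p, eps), chebyshev_bound(n, p, eps))
```

For any fixed $\epsilon$ the exact probability is positive for every n shown, but it shrinks as n grows, and the Chebyshev bound is typically far from tight.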


In summary, the sample average $A_Y^n$ is an rv whose mean $\bar{Y}$ is independent of n, but whose
standard deviation $\sigma_Y/\sqrt{n}$ approaches 0 as $n \to \infty$. Therefore the distribution of the sample
average becomes concentrated near $\bar{Y}$ as n increases. The WLLN is simply this concentration
property, stated more precisely by either (2.15) or (2.16).


The WLLN, in the form of (2.16), applies much more generally than the simple case of iid rv’s.
In fact, (2.16) provides the central link between probability models and the real phenomena
being modeled. One can observe the outcomes both for the model and reality, but probabilities
are assigned only for the model. The WLLN, applied to a sequence of rv’s in the model, and
the concentration property (if it exists), applied to the corresponding real phenomenon, provide
the basic check on whether the model corresponds reasonably to reality.


2.7.2 The asymptotic equipartition property


This section starts with a sequence of iid random symbols and defines a sequence of random 
variables (rv’s) as functions of those symbols. The WLLN, applied to these rv’s, will permit 
the classification of sample sequences of symbols as being ‘typical’ or not, and then lead to the 
results alluded to earlier. 


Let $X_1, X_2, \ldots$ be a sequence of iid discrete random symbols with a common pmf $p_X(x) > 0$, $x \in \mathcal{X}$.
For each symbol $x$ in the alphabet $\mathcal{X}$, let $w(x) = -\log p_X(x)$. For each $X_k$ in the sequence,
define $W(X_k)$ to be the rv that takes the value $w(x)$ for $X_k = x$. Then $W(X_1), W(X_2), \ldots$ is a
sequence of iid discrete rv’s, each with mean


$$E[W(X_k)] = -\sum_{x \in \mathcal{X}} p_X(x) \log p_X(x) = H[X], \tag{2.17}$$


where H[X] is the entropy of the random symbol X.


The rv $W(X_k)$ is the log pmf of $X_k$ and the entropy of $X_k$ is the mean of $W(X_k)$.


The most important property of the log pmf for iid random symbols comes from observing, for
example, that for the event $X_1 = x_1, X_2 = x_2$, the outcome for $W(X_1) + W(X_2)$ is


$$w(x_1) + w(x_2) = -\log p_X(x_1) - \log p_X(x_2) = -\log\{p_{X_1 X_2}(x_1 x_2)\}. \tag{2.18}$$


In other words, the joint pmf for independent random symbols is the product of the individual 
pmf’s, and therefore the log of the joint pmf is the sum of the logs of the individual pmf’s. 


We can generalize (2.18) to a string of n random symbols, $X^n = (X_1, \ldots, X_n)$. For an event
$X^n = x^n$ where $x^n = (x_1, \ldots, x_n)$, the outcome for the sum $W(X_1) + \cdots + W(X_n)$ is


$$\sum_{k=1}^{n} w(x_k) = -\sum_{k=1}^{n} \log p_X(x_k) = -\log p_{X^n}(x^n). \tag{2.19}$$


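The WLLN can now be applied to the sample average of the log pmfs. Let


$$A_W^n = \frac{W(X_1) + \cdots + W(X_n)}{n} = \frac{-\log p_{X^n}(X^n)}{n} \tag{2.20}$$


be the sample average of the log pmf.


From (2.15), it follows that


$$\Pr\left( \left| A_W^n - E[W(X)] \right| \ge \epsilon \right) \le \frac{\sigma_W^2}{n\epsilon^2}. \tag{2.21}$$


Substituting (2.17) and (2.20) into (2.21),


$$\Pr\left( \left| \frac{-\log p_{X^n}(X^n)}{n} - H[X] \right| \ge \epsilon \right) \le \frac{\sigma_W^2}{n\epsilon^2}. \tag{2.22}$$


In order to interpret this result, define the typical set $T_\epsilon^n$ for any $\epsilon > 0$ as


$$T_\epsilon^n = \left\{ x^n : \left| \frac{-\log p_{X^n}(x^n)}{n} - H[X] \right| < \epsilon \right\}. \tag{2.23}$$


Thus $T_\epsilon^n$ is the set of source strings of length n for which the sample average of the log pmf is
within $\epsilon$ of its mean H[X]. Eq. (2.22) then states that the aggregate probability of all strings
of length n not in $T_\epsilon^n$ is at most $\sigma_W^2/(n\epsilon^2)$. Thus,


$$\Pr(X^n \in T_\epsilon^n) \ge 1 - \frac{\sigma_W^2}{n\epsilon^2}. \tag{2.24}$$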


As n increases, the aggregate probability of $T_\epsilon^n$ approaches 1 for any given $\epsilon > 0$, so $T_\epsilon^n$ is
certainly a typical set of source strings. This is illustrated in Figure 2.15.


[Figure: distribution functions of the sample average log pmf for two block lengths, plotted against $w$
with $H-\epsilon$, $H$ and $H+\epsilon$ marked on the axis.]

Figure 2.15: Sketch of the distribution function of the sample average log pmf. As n
increases, the distribution function approaches a unit step at H. The typical set is the
set of sample strings of length n for which the sample average log pmf stays within $\epsilon$
of H; as illustrated, its probability approaches 1 as $n \to \infty$.


Rewrite (2.23) in the form


$$T_\epsilon^n = \left\{ x^n : n(H[X] - \epsilon) < -\log p_{X^n}(x^n) < n(H[X] + \epsilon) \right\}.$$


Multiplying by $-1$ and exponentiating,


$$T_\epsilon^n = \left\{ x^n : 2^{-n(H[X]+\epsilon)} < p_{X^n}(x^n) < 2^{-n(H[X]-\epsilon)} \right\}. \tag{2.25}$$


Eq. (2.25) has the intuitive connotation that the n-strings in $T_\epsilon^n$ are approximately equiprobable.
This is the same kind of approximation that one would use in saying that $10^{-1001} \approx 10^{-1000}$;
these numbers differ by a factor of 10, but for such small numbers it makes sense to compare the
exponents rather than the numbers themselves. In the same way, the ratio of the upper to lower
bound in (2.25) is $2^{2\epsilon n}$, which grows unboundedly with n for fixed $\epsilon$. However, as seen in (2.23),
$-\frac{1}{n} \log p_{X^n}(x^n)$ is approximately equal to H[X] for all $x^n \in T_\epsilon^n$. This is the important notion,
and it does no harm to think of the n-strings in $T_\epsilon^n$ as being approximately equiprobable.


The set of all n-strings of source symbols is thus separated into the typical set $T_\epsilon^n$ and the
complementary atypical set $(T_\epsilon^n)^c$. The atypical set has aggregate probability no greater than
$\sigma_W^2/(n\epsilon^2)$, and the elements of the typical set are approximately equiprobable (in this peculiar
sense), each with probability about $2^{-nH[X]}$.


The typical set $T_\epsilon^n$ depends on the choice of $\epsilon$. As $\epsilon$ decreases, the equiprobable approximation
(2.25) becomes tighter, but the bound (2.24) on the probability of the typical set is further
from 1. As n increases, however, $\epsilon$ can be slowly decreased, thus bringing the probability of the
typical set closer to 1 and simultaneously tightening the bounds on equiprobable strings.

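Let us now estimate the number of elements $|T_\epsilon^n|$ in the typical set. Since $p_{X^n}(x^n) > 2^{-n(H[X]+\epsilon)}$
for each $x^n \in T_\epsilon^n$,


$$1 \ge \sum_{x^n \in T_\epsilon^n} p_{X^n}(x^n) > |T_\epsilon^n| \, 2^{-n(H[X]+\epsilon)}.$$


This implies that $|T_\epsilon^n| < 2^{n(H[X]+\epsilon)}$. In other words, since each $x^n \in T_\epsilon^n$ contributes at least
$2^{-n(H[X]+\epsilon)}$ to the probability of $T_\epsilon^n$, the number of these contributions can be no greater than
$2^{n(H[X]+\epsilon)}$.


Conversely, since $\Pr(T_\epsilon^n) \ge 1 - \sigma_W^2/(n\epsilon^2)$, $|T_\epsilon^n|$ can be lower bounded by


$$1 - \frac{\sigma_W^2}{n\epsilon^2} \le \sum_{x^n \in T_\epsilon^n} p_{X^n}(x^n) < |T_\epsilon^n| \, 2^{-n(H[X]-\epsilon)},$$


which implies $|T_\epsilon^n| > [1 - \sigma_W^2/(n\epsilon^2)] \, 2^{n(H[X]-\epsilon)}$. In summary,


$$\left( 1 - \frac{\sigma_W^2}{n\epsilon^2} \right) 2^{n(H[X]-\epsilon)} < |T_\epsilon^n| < 2^{n(H[X]+\epsilon)}. \tag{2.26}$$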


For large n, then, the typical set $T_\epsilon^n$ has aggregate probability approximately 1 and contains
approximately $2^{nH[X]}$ elements, each of which has probability approximately $2^{-nH[X]}$. That is,
asymptotically for very large n, the random symbol $X^n$ resembles an equiprobable source with
alphabet size $2^{nH[X]}$.


The quantity $\sigma_W^2/(n\epsilon^2)$ in many of the equations above is simply a particular upper bound to
the probability of the atypical set. It becomes arbitrarily small as n increases for any fixed
$\epsilon > 0$. Thus it is insightful to simply replace this quantity with a real number $\delta$; for any such
$\delta > 0$ and any $\epsilon > 0$, $\sigma_W^2/(n\epsilon^2) \le \delta$ for large enough n.


This set of results, summarized in the following theorem, is known as the asymptotic equipartition 
property (AEP). 


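Theorem 2.7.1 (Asymptotic equipartition property). Let $X^n$ be a string of n iid discrete
random symbols $\{X_k; 1 \le k \le n\}$ each with entropy H[X]. For all $\delta > 0$ and all sufficiently large
n, $\Pr(T_\epsilon^n) \ge 1 - \delta$ and $|T_\epsilon^n|$ is bounded by


$$(1 - \delta) \, 2^{n(H[X]-\epsilon)} < |T_\epsilon^n| < 2^{n(H[X]+\epsilon)}. \tag{2.27}$$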


Finally, note that the total number of different strings of length n from a source with alphabet
size M is $M^n$. For non-equiprobable sources, namely sources with H[X] < log M, the ratio of
the number of typical strings to total strings is approximately $2^{-n(\log M - H[X])}$, which approaches
0 exponentially with n. Thus, for large n, the great majority of n-strings are atypical. It may
be somewhat surprising that this great majority counts for so little in probabilistic terms. As
shown in Exercise 2.26, the most probable of the individual sequences are also atypical. There
are too few of them, however, to have any significance.
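

For a binary source these quantities can be computed exactly, since every n-string with the same number of 1’s has the same probability. The Python sketch below is an illustration under assumed parameters ($p_1 = 0.2$, $n = 200$, $\epsilon = 0.1$) and with a hypothetical function name; it reports the size and aggregate probability of the typical set so that they can be compared with (2.24) and (2.26).

```python
from math import comb, log2


def typical_set_stats(n, p1, eps):
    """Return (H, var_w, size, prob) for the typical set of n iid binary symbols."""
    p0 = 1 - p1
    H = -p0 * log2(p0) - p1 * log2(p1)                     # entropy H[X]
    var_w = p0 * log2(p0) ** 2 + p1 * log2(p1) ** 2 - H**2  # variance of the log pmf rv
    size, prob = 0, 0.0
    for k in range(n + 1):  # k = number of 1's in the string
        log_p = k * log2(p1) + (n - k) * log2(p0)  # log pmf of any such string
        if abs(-log_p / n - H) < eps:              # membership test of (2.23)
            size += comb(n, k)
            prob += comb(n, k) * 2.0**log_p
    return H, var_w, size, prob


if __name__ == "__main__":
    n, p1, eps = 200, 0.2, 0.1
    H, var_w, size, prob = typical_set_stats(n, p1, eps)
    print("Pr(T) =", prob, " lower bound (2.24) =", 1 - var_w / (n * eps**2))
    print("log2 |T| =", log2(size), " between", n * (H - eps), "and", n * (H + eps))
    print("fraction of all n-strings that are typical =", size / 2**n)
```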


We next consider source coding in the light of the AEP.


2.7.3 Source coding theorems


Motivated by the AEP, we can take the approach that an encoder operating on strings of n source
symbols need only provide a codeword for each string $x^n$ in the typical set $T_\epsilon^n$. If a sequence
$x^n$ occurs that is not in $T_\epsilon^n$, then a source coding failure is declared. Since the probability of
$x^n \notin T_\epsilon^n$ can be made arbitrarily small by choosing n large enough, this situation is tolerable.


In this approach, since there are fewer than $2^{n(H[X]+\epsilon)}$ strings of length n in $T_\epsilon^n$, the number
of source codewords that need to be provided is fewer than $2^{n(H[X]+\epsilon)}$. Choosing fixed-length
codewords of length $\lceil n(H[X]+\epsilon) \rceil$ is more than sufficient and even allows for an extra codeword,
if desired, to indicate that a coding failure has occurred. In bits per source symbol, taking the
ceiling function into account, $L \le H[X] + \epsilon + 1/n$. Note that $\epsilon > 0$ is arbitrary, and for any such
$\epsilon$, $\Pr\{\text{failure}\} \to 0$ as $n \to \infty$. This proves the following theorem:


Theorem 2.7.2 (Fixed-to-fixed-length source coding theorem). For any discrete memoryless
source with entropy H[X], any $\epsilon > 0$, any $\delta > 0$, and any sufficiently large n, there is a
fixed-to-fixed-length source code with $\Pr\{\text{failure}\} \le \delta$ that maps blocks of n source symbols into
fixed-length codewords of length $L \le H[X] + \epsilon + 1/n$ bits per source symbol.


We saw in Section 2.2 that the use of fixed-to-fixed-length source coding requires log M bits per
source symbol if unique decodability is required (i.e., no failures are allowed), and now we see
that this is reduced to arbitrarily little more than H[X] bits per source symbol if arbitrarily rare
failures are allowed. This is a good example of a situation where ‘arbitrarily small $\delta > 0$’ and 0
behave very differently.


There is also a converse to this theorem following from the other side of the AEP theorem. This 
says that the error probability approaches 1 for large n if strictly fewer than H[X] bits per source 
symbol are provided. 


Theorem 2.7.3 (Converse for fixed-to-fixed-length codes). Let $X^n$ be a string of n iid
discrete random symbols $\{X_k; 1 \le k \le n\}$, with entropy H[X] each. For any $\nu > 0$, let $X^n$ be
encoded into fixed-length codewords of length $\lfloor n(H[X] - \nu) \rfloor$ bits. For every $\delta > 0$ and for all
sufficiently large n given $\delta$,


$$\Pr\{\text{failure}\} > 1 - \delta - 2^{-\nu n/2}. \tag{2.28}$$


Proof: Apply the AEP, Theorem 2.7.1, with $\epsilon = \nu/2$. Codewords can be provided for at
most $2^{n(H[X]-\nu)}$ typical source n-sequences, and from (2.25) each of these has a probability at
most $2^{-n(H[X]-\nu/2)}$. Thus the aggregate probability of typical sequences for which codewords
are provided is at most $2^{-n\nu/2}$. From the AEP theorem, $\Pr\{T_\epsilon^n\} \ge 1 - \delta$ is satisfied for large
enough n. Codewords^15 can be provided for at most a subset of $T_\epsilon^n$ of probability $2^{-n\nu/2}$, and
the remaining elements of $T_\epsilon^n$ must all lead to errors, thus yielding (2.28).


In going from fixed-length codes of slightly more than H[X] bits per source symbol to codes of 
slightly less than H[X] bits per source symbol, the probability of failure goes from almost 0 to 
almost 1, and as n increases, those limits are approached more and more closely. 


^15 Note that the proof allows codewords to be provided for atypical sequences; it simply says that a large portion
of the typical set cannot be encoded.


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].




2.7.4 The entropy bound for general classes of codes 


We have seen that the expected number of encoded bits per source symbol is lower bounded 
by H[X] for iid sources using either fixed-to-fixed-length or fixed-to-variable-length codes. The 
details differ in the sense that very improbable sequences are simply dropped in fixed-length 
schemes but have abnormally long encodings, leading to buffer overflows, in variable-length 
schemes. 


We now show that other types of codes, such as variable-to-fixed, variable-to-variable, and even 
more general codes are also subject to the entropy limit. Rather than describing the highly varied 
possible nature of these source codes, this will be shown by simply defining certain properties 
that the associated decoders must have. By doing this, it is also shown that yet undiscovered 
coding schemes must also be subject to the same limits. The fixed-to-fixed-length converse in 
the last subsection is the key to this. 


For any encoder, there must be a decoder that maps the encoded bit sequence back into the 
source symbol sequence. For prefix-free codes on k-tuples of source symbols, the decoder waits 
for each variable length codeword to arrive, maps it into the corresponding k-tuple of source 
symbols, and then starts decoding for the next k-tuple. For fixed-to-fixed-length schemes, the 
decoder waits for a block of code symbols and then decodes the corresponding block of source 
symbols. 


In general, the source produces a non-ending sequence $X_1, X_2, \ldots$ of source letters which are
encoded into a non-ending sequence of encoded binary digits. The decoder observes this encoded
sequence and decodes source symbol $X_n$ when enough bits have arrived to make a decision on
it.

For any given coding and decoding scheme for a given iid source, define the rv $D_n$ as the number
of received bits that permit a decision on $X^n = X_1, \ldots, X_n$. This includes the possibility of
coders and decoders for which some sample source strings are decoded incorrectly or postponed
infinitely. For these $x^n$, the sample value of $D_n$ is taken to be infinite. It is assumed, however,
that all decisions are final in the sense that the decoder cannot decide on a particular $x^n$ after
observing an initial string of the encoded sequence and then change that decision after observing
more of the encoded sequence. What we would like is a scheme in which decoding is correct
with high probability and the sample value of the rate, $D_n/n$, is small with high probability.
What the following theorem shows is that for large n, the sample rate can be strictly below the
entropy only with vanishingly small probability. This then shows that the entropy lower bounds
the data rate in this strong sense.


Theorem 2.7.4 (Converse for general coders/decoders for iid sources). Let $X^\infty$ be a
sequence of discrete random symbols $\{X_k; 1 \le k \le \infty\}$. For each integer $n \ge 1$, let $X^n$ be the
first n of those symbols. For any given encoder and decoder, let $D_n$ be the number of received
bits at which the decoder can correctly decode $X^n$. Then for any $\nu > 0$ and $\delta > 0$, and for any
sufficiently large n given $\nu$ and $\delta$,


$$\Pr\{D_n \le n[H[X] - \nu]\} < \delta + 2^{-\nu n/2}. \tag{2.29}$$


Proof: For any sample value $x^\infty$ of the source sequence, let $y^\infty$ denote the encoded sequence.
For any given integer $n \ge 1$, let $m = \lfloor n[H[X] - \nu] \rfloor$. Suppose that $x^n$ is decoded upon observation
of $y^j$ for some $j \le m$. Since decisions are final, there is only one source n-string, namely $x^n$,
that can be decoded by the time $y^m$ is observed. This means that out of the $2^m$ possible initial
m-strings from the encoder, there can be at most^16 $2^m$ n-strings from the source that can be decoded
from the observation of the first m encoded outputs. The aggregate probability of any set of $2^m$
source n-strings is bounded in Theorem 2.7.3, and (2.29) simply repeats that bound.


2.8 Markov sources 


The basic coding results for discrete memoryless sources have now been derived. Many of the 
results, in particular the Kraft inequality, the entropy bounds on expected length for uniquely- 
decodable codes, and the Huffman algorithm, do not depend on the independence of successive 
source symbols. 


In this section, these results are extended to sources defined in terms of finite-state Markov
chains. The state of the Markov chain^17 is used to represent the “memory” of the source.
Labels on the transitions between states are used to represent the next symbol out of the source.
Thus, for example, the state could be the previous symbol from the source, or could be the
previous 300 symbols. It is possible to model as much memory as desired while staying in the
regime of finite-state Markov chains.


Example 2.8.1. Consider a binary source with outputs $X_1, X_2, \ldots$. Assume that the symbol
probabilities for $X_k$ are conditioned on $X_{k-2}$ and $X_{k-1}$ but are independent of all previous
symbols given these past 2 symbols. This pair of previous symbols is modeled by a state $S_{k-1}$.
The alphabet of possible states is then the set of binary pairs, $\mathcal{S} = \{[00], [01], [10], [11]\}$. In
Figure 2.16, the states are represented as the nodes of the graph representing the Markov chain,
and the source outputs are labels on the graph transitions. Note, for example, that from the state
$S_{k-1} = [01]$ (representing $X_{k-2}=0$, $X_{k-1}=1$), the output $X_k=1$ causes a transition to $S_k = [11]$
(representing $X_{k-1}=1$, $X_k=1$). The chain is assumed to start at time 0 in a state $S_0$ given by
some arbitrary pmf.


Figure 2.16: Markov source: Each transition $s' \to s$ is labeled by the corresponding
source output and the transition probability $\Pr\{S_k = s \mid S_{k-1} = s'\}$.


16 There are two reasons why the number of decoded n-strings of source symbols by time m can be less than $2^m$.
The first is that the first n source symbols might not be decodable until after the mth encoded bit is received.
The second is that multiple m-strings of encoded bits might lead to decoded strings with the same first n source
symbols.

17 The basic results about finite-state Markov chains, including those used here, are established in many texts
such as [8] and [25]. These results are important in the further study of digital communication, but are not
essential here.


Note that this particular source is characterized by long strings of zeros and long strings of ones
interspersed with short transition regions. For example, starting in state 00, a representative
output would be

00000000101111111111111011111111010100000000 ...

Note that if $s_k = [x_{k-1} x_k]$ then the next state must be either $s_{k+1} = [x_k 0]$ or $s_{k+1} = [x_k 1]$; i.e.,
each state has only two transitions coming out of it.
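
The source of Example 2.8.1 is easy to simulate, which gives a feel for the long runs mentioned above. The following is a minimal sketch, not part of the original development. It assumes the transition probabilities implied by the pmfs quoted later in Example 2.8.2 (Figure 2.16 itself is not reproduced here), and the names `P_ONE`, `next_state`, and `generate` are illustrative only.

```python
import random

# P(next symbol = 1 | state). Assumed values: taken from the pmfs quoted in
# Example 2.8.2, since Figure 2.16 itself is not reproduced here.
P_ONE = {"00": 0.1, "01": 0.5, "10": 0.5, "11": 0.9}


def next_state(state, x):
    """State [x_{k-1} x_k] followed by output x moves to [x_k x]."""
    return state[1] + str(x)


def generate(n, state="00", seed=0):
    """Return n source symbols as a string of 0s and 1s."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    out = []
    for _ in range(n):
        x = 1 if rng.random() < P_ONE[state] else 0
        out.append(str(x))
        state = next_state(state, x)
    return "".join(out)


print(generate(60))
```

Because each state has exactly two outgoing transitions, the update in `next_state` is all that is needed to track the state from the output sequence.
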
The above example is now generalized to an arbitrary discrete Markov source. 


Definition 2.8.1. A finite-state Markov chain is a sequence $S_0, S_1, \ldots$ of discrete random symbols
from a finite alphabet, $\mathcal{S}$. There is a pmf $q_0(s)$, $s \in \mathcal{S}$, on $S_0$, and there is a conditional pmf
$Q(s|s')$ such that for all $k \ge 1$, all $s \in \mathcal{S}$, and all $s' \in \mathcal{S}$,

$$\Pr(S_k = s \mid S_{k-1} = s') = \Pr(S_k = s \mid S_{k-1} = s', \ldots, S_0 = s_0) = Q(s|s'). \tag{2.30}$$

There is said to be a transition from $s'$ to $s$, denoted $s' \to s$, if $Q(s|s') > 0$.


Note that (2.30) says, first, that the conditional probability of a state, given the past, depends
only on the previous state, and second, that these transition probabilities $Q(s|s')$ do not change
with time.


Definition 2.8.2. A Markov source is a sequence of discrete random symbols $X_1, X_2, \ldots$ with a
common alphabet $\mathcal{X}$ which is based on a finite-state Markov chain $S_0, S_1, \ldots$. Each transition
$(s' \to s)$ in the Markov chain is labeled with a symbol from $\mathcal{X}$; each symbol from $\mathcal{X}$ can appear
on at most one outgoing transition from each state.


Note that the state alphabet $\mathcal{S}$ and the source alphabet $\mathcal{X}$ are in general different. Since
each source symbol appears on at most one transition from each state, the initial state $S_0 = s_0$,
combined with the source output, $X_1 = x_1, X_2 = x_2, \ldots$, uniquely identifies the state sequence,
and, of course, the state sequence uniquely specifies the source output sequence. If $x \in \mathcal{X}$ labels
the transition $s' \to s$, then the conditional probability of that $x$ is given by $P(x|s') = Q(s|s')$.
Thus, for example, in the transition $[00] \to [01]$ in Figure 2.16, $Q([01] \mid [00]) = P(1 \mid [00])$.


The reason for distinguishing the Markov chain alphabet from the source output alphabet is to 
allow the state to represent an arbitrary combination of past events rather than just the previous 
source output. It is this feature that permits Markov source models to reasonably model both 
simple and complex forms of memory. 


A state $s$ is accessible from state $s'$ in a Markov chain if there is a path in the corresponding
graph from $s'$ to $s$, i.e., if $\Pr(S_k = s \mid S_0 = s') > 0$ for some $k > 0$. The period of a state $s$ is
the greatest common divisor of the set of integers $k \ge 1$ for which $\Pr(S_k = s \mid S_0 = s) > 0$. A
finite-state Markov chain is ergodic if all states are accessible from all other states and if all
states are aperiodic, i.e., have period 1.


We will consider only Markov sources for which the Markov chain is ergodic. An important fact
about ergodic Markov chains is that the chain has steady-state probabilities $q(s)$ for all $s \in \mathcal{S}$,
given by the unique solution to the linear equations

$$q(s) = \sum_{s' \in \mathcal{S}} q(s')\, Q(s|s'), \quad s \in \mathcal{S}; \tag{2.31}$$

$$\sum_{s \in \mathcal{S}} q(s) = 1.$$

These steady-state probabilities are approached asymptotically from any starting state, i.e.,

$$\lim_{k \to \infty} \Pr(S_k = s \mid S_0 = s') = q(s) \quad \text{for all } s, s' \in \mathcal{S}. \tag{2.32}$$
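
As an illustration of (2.31) and (2.32), the sketch below repeatedly applies the transition probabilities to an arbitrary starting pmf and lets the iteration settle. It is only a sketch under stated assumptions: the chain is the four-state chain of Example 2.8.1 with the transition probabilities implied by Example 2.8.2, and `Q`, `step`, and `steady_state` are names introduced here for illustration.

```python
STATES = ["00", "01", "10", "11"]
# P(next symbol = 1 | state); assumed from the pmfs quoted in Example 2.8.2.
P_ONE = {"00": 0.1, "01": 0.5, "10": 0.5, "11": 0.9}


def Q(s, s_prev):
    """Transition probability Q(s | s_prev) for the two-bit-memory source."""
    if s[0] != s_prev[1]:
        return 0.0  # the new state must begin with the last bit of the old one
    p1 = P_ONE[s_prev]
    return p1 if s[1] == "1" else 1.0 - p1


def step(q):
    """Apply the right side of (2.31) once: sum over s' of q(s') Q(s | s')."""
    return {s: sum(q[sp] * Q(s, sp) for sp in STATES) for s in STATES}


def steady_state(q0, iters=500):
    """Iterate from the pmf q0; by (2.32) this converges for an ergodic chain."""
    q = dict(q0)
    for _ in range(iters):
        q = step(q)
    return q


q = steady_state({"00": 1.0, "01": 0.0, "10": 0.0, "11": 0.0})
for s in STATES:
    print(s, round(q[s], 4))
```

For this chain the iteration settles at q([00]) = q([11]) = 5/12 and q([01]) = q([10]) = 1/12, and the limit does not depend on the starting pmf. Solving the linear equations (2.31) directly is the more usual route; iteration is used here only because it mirrors (2.32).
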


2.8.1 Coding for Markov sources


The simplest approach to coding for Markov sources is that of using a separate prefix-free code
for each state in the underlying Markov chain. That is, for each $s \in \mathcal{S}$, select a prefix-free
code whose lengths $l(x,s)$ are appropriate for the conditional pmf $P(x|s) > 0$. The codeword
lengths for the code used in state $s$ must of course satisfy the Kraft inequality $\sum_x 2^{-l(x,s)} \le 1$.
The minimum expected length, $L_{\min}(s)$, for each such code can be generated by the Huffman
algorithm and satisfies

$$H[X|s] \le L_{\min}(s) < H[X|s] + 1, \tag{2.33}$$

where, for each $s \in \mathcal{S}$, $H[X|s] = \sum_x -P(x|s) \log P(x|s)$.


If the initial state $S_0$ is chosen according to the steady-state pmf $\{q(s); s \in \mathcal{S}\}$, then, from (2.31),
the Markov chain remains in steady state and the overall expected codeword length is given by

$$H[X|S] \le L_{\min} < H[X|S] + 1, \tag{2.34}$$

where

$$L_{\min} = \sum_{s \in \mathcal{S}} q(s)\, L_{\min}(s) \quad \text{and} \tag{2.35}$$

$$H[X|S] = \sum_{s \in \mathcal{S}} q(s)\, H[X|s]. \tag{2.36}$$


Assume that the encoder transmits the initial state $s_0$ at time 0. If $M'$ is the number of elements
in the state space, then this can be done with $\lceil \log M' \rceil$ bits, but this can be ignored since it is
done only at the beginning of transmission and does not affect the long term expected number
of bits per source symbol. The encoder then successively encodes each source symbol $x_k$ using
the code for the state at time $k-1$. The decoder, after decoding the initial state $s_0$, can decode
$x_1$ using the code based on state $s_0$. After determining $s_1$ from $s_0$ and $x_1$, the decoder can
decode $x_2$ using the code based on $s_1$. The decoder can continue decoding each source symbol,
and thus the overall code is uniquely decodable. We next must understand the meaning of the
conditional entropy in (2.36).
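
The per-state encoding and decoding procedure described in this subsection can be sketched in a few lines. Everything specific below is hypothetical and chosen only for illustration: a three-symbol alphabet {a, b, c}, a state equal to the previous symbol, and hand-picked prefix-free codes (one per state) that give the shortest codeword to a repeat of the previous symbol. In practice each per-state code would come from the Huffman algorithm applied to $P(x|s)$.

```python
# Hypothetical per-state prefix-free codes: CODES[state][symbol] -> codeword.
# The state is taken to be the previous source symbol (an assumption made here).
CODES = {
    "a": {"a": "0", "b": "10", "c": "11"},
    "b": {"b": "0", "c": "10", "a": "11"},
    "c": {"c": "0", "a": "10", "b": "11"},
}


def next_state(state, x):
    """For this toy source the new state is simply the symbol just emitted."""
    return x


def encode(symbols, s0):
    """Encode each symbol with the code belonging to the current state."""
    state, bits = s0, []
    for x in symbols:
        bits.append(CODES[state][x])
        state = next_state(state, x)
    return "".join(bits)


def decode(bits, s0):
    """Track the state exactly as the encoder does and parse codeword by codeword."""
    inverse = {s: {cw: x for x, cw in code.items()} for s, code in CODES.items()}
    state, out, word = s0, [], ""
    for b in bits:
        word += b
        if word in inverse[state]:  # prefix-free: the first hit is the codeword
            x = inverse[state][word]
            out.append(x)
            state = next_state(state, x)
            word = ""
    return "".join(out)


message = "aaabbbbccca"
bits = encode(message, "a")
print(bits, decode(bits, "a") == message)
```

The decoder needs only the initial state and the code tables. It recovers each state from the previous state and the symbol just decoded, which is the argument for unique decodability given above.
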


2.8.2 Conditional entropy 


It turns out that the conditional entropy H[X|S] plays the same role in coding for Markov 
sources as the ordinary entropy H[X] plays for the memoryless case. Rewriting (2.36), 


$$H[X|S] = \sum_{s \in \mathcal{S}} \sum_{x \in \mathcal{X}} q(s)\, P(x|s) \log \frac{1}{P(x|s)}.$$


This is the expected value of the rv log[1/P(X| S)]. 


An important entropy relation, for arbitrary discrete rv’s, is 


$$H[XS] = H[S] + H[X|S]. \tag{2.37}$$

To see this,

$$H[XS] = \sum_{s,x} q(s) P(x|s) \log \frac{1}{q(s) P(x|s)}$$

$$= \sum_{s,x} q(s) P(x|s) \log \frac{1}{q(s)} + \sum_{s,x} q(s) P(x|s) \log \frac{1}{P(x|s)}$$

$$= H[S] + H[X|S].$$


Exercise 2.19 demonstrates that

$$H[XS] \le H[S] + H[X].$$

Comparing this and (2.37), it follows that

$$H[X|S] \le H[X]. \tag{2.38}$$

This is an important inequality in information theory. If the entropy $H[X]$ is viewed as a measure of mean
uncertainty, then the conditional entropy $H[X|S]$ should be viewed as a measure of mean uncertainty
after the observation of the outcome of $S$. If $X$ and $S$ are not statistically independent,
then intuition suggests that the observation of $S$ should reduce the mean uncertainty in $X$; this
equation indeed verifies this.


Example 2.8.2. Consider Figure 2.16 again. It is clear from symmetry that, in steady state,
$p_X(0) = p_X(1) = 1/2$. Thus $H[X] = 1$ bit. Conditional on $S = [00]$, $X$ is binary with pmf {0.1,
0.9}, so $H[X \mid [00]] = -0.1 \log 0.1 - 0.9 \log 0.9 = 0.47$ bits. Similarly, $H[X \mid [11]] = 0.47$ bits,
and $H[X \mid [01]] = H[X \mid [10]] = 1$ bit. The solution to the steady-state equations in (2.31) is
$q([00]) = q([11]) = 5/12$ and $q([01]) = q([10]) = 1/12$. Thus, the conditional entropy, averaged
over the states, is $H[X|S] = 0.558$ bits.
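
The arithmetic of Example 2.8.2 can be checked directly from (2.36). The sketch below uses the per-state pmfs and steady-state probabilities quoted in the example; the helper name `binary_entropy` is introduced here for illustration.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a binary pmf {p, 1 - p}."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


# P(X = 1 | state) and steady-state probabilities q(s), as quoted in the example.
P_ONE = {"00": 0.1, "01": 0.5, "10": 0.5, "11": 0.9}
q = {"00": 5 / 12, "01": 1 / 12, "10": 1 / 12, "11": 5 / 12}

H_given_state = {s: binary_entropy(p) for s, p in P_ONE.items()}
H_X_given_S = sum(q[s] * H_given_state[s] for s in q)  # equation (2.36)

print({s: round(h, 3) for s, h in H_given_state.items()})
print(round(H_X_given_S, 4))
```

Carried to more digits, $H[X \mid [00]]$ is about 0.469 bits and the average is about 0.5575 bits; the figure of 0.558 quoted above results from using the rounded value 0.47. Either way the result is well below $H[X] = 1$ bit, in line with (2.38).
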


For this example, it is particularly silly to use a different prefix-free code for the source output 
for each prior state. The problem is that the source is binary, and thus the prefix-free code will 
have length 1 for each symbol no matter what the state. As with the memoryless case, however, 
the use of fixed-to-variable-length codes is a solution to these problems of small alphabet sizes 
and integer constraints on codeword lengths. 


Let $E[L(X^n)]_{\min,s}$ be the minimum expected length of a prefix-free code for $X^n$ conditional on
starting in state $s$. Then, applying (2.13) to the situation here,

$$H[X^n | s] \le E[L(X^n)]_{\min,s} < H[X^n | s] + 1.$$


Assume as before that the Markov chain starts in steady state $S_0$. Thus it remains in steady
state at each future time. Furthermore assume that the initial sample state is known at the
decoder. Then the sample state continues to be known at each future time. Using a minimum
expected length code for each initial sample state,

$$H[X^n | S_0] \le E[L(X^n)]_{\min,S_0} < H[X^n | S_0] + 1. \tag{2.39}$$


Since the Markov source remains in steady state, the average entropy of each source symbol
given the state is $H[X | S_0]$, so intuition suggests (and Exercise 2.32 verifies) that

$$H[X^n | S_0] = n H[X | S_0]. \tag{2.40}$$

Defining $L_{\min,n} = E[L(X^n)]_{\min,S_0}/n$ as the minimum expected codeword length per input symbol
when starting in steady state,

$$H[X | S_0] \le L_{\min,n} < H[X | S_0] + 1/n. \tag{2.41}$$
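
The step from (2.39) and (2.40) to (2.41) is a division by n. It is written out here for completeness, using only the quantities already defined:

```latex
\begin{align*}
\frac{H[X^n \mid S_0]}{n}
  &\le \frac{E[L(X^n)]_{\min,S_0}}{n}
   < \frac{H[X^n \mid S_0]}{n} + \frac{1}{n}
  && \text{(2.39) divided by } n \\
H[X \mid S_0]
  &\le L_{\min,n}
   < H[X \mid S_0] + \frac{1}{n}
  && \text{by (2.40) and the definition of } L_{\min,n}
\end{align*}
```
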


The asymptotic equipartition property (AEP) also holds for Markov sources. Here, however,
there are$^{18}$ approximately $2^{nH[X|S]}$ typical strings of length n, each with probability approximately
equal to $2^{-nH[X|S]}$. It follows as in the memoryless case that $H[X|S]$ is the minimum
possible rate at which source symbols can be encoded subject either to unique decodability or to
fixed-to-fixed-length encoding with small probability of failure. The arguments are essentially
the same as in the memoryless case.


The analysis of Markov sources will not be carried further here, since the additional required 
ideas are minor modifications of the memoryless case. Curiously, most of our insights and 
understanding about source coding come from memoryless sources. At the same time, however,
most sources of practical importance can be insightfully modeled as Markov and hardly any 
can be reasonably modeled as memoryless. In dealing with practical sources, we combine the 
insights from the memoryless case with modifications suggested by Markov memory. 


The AEP can be generalized to a still more general class of discrete sources called ergodic 
sources. These are essentially sources for which sample time averages converge in some proba- 
bilistic sense to ensemble averages. We do not have the machinery to define ergodicity, and the 
additional insight that would arise from studying the AEP for this class would consist primarily 
of mathematical refinements. 


2.9 Lempel-Ziv universal data compression 


The Lempel-Ziv data compression algorithms differ from the source coding algorithms studied 
in previous sections in the following ways: 


• They use variable-to-variable-length codes in which both the number of source symbols
encoded and the number of encoded bits per codeword are variable. Moreover, the codes
are time-varying.


• They do not require prior knowledge of the source statistics, yet over time they adapt so
that the average codeword length L per source symbol is minimized in some sense to be
discussed later. Such algorithms are called universal.


• They have been widely used in practice; they provide a simple approach to understanding
universal data compression even though newer schemes now exist.


The Lempel-Ziv compression algorithms were developed in 1977-78. The first, LZ77 [37], uses 
string-matching on a sliding window; the second, LZ78 [38], uses an adaptive dictionary. The 
LZ78 algorithm was implemented many years ago in UNIX compress and in many other
places. Implementations of LZ77 appeared somewhat later (Stac Stacker, Microsoft Windows)
and are still widely used.


18 There are additional details here about whether the typical sequences include the initial state or not, but
these differences become unimportant as n becomes large.


In this section, the LZ77 algorithm is described, accompanied by a high-level description of why
it works. Finally, an approximate analysis of its performance on Markov sources is given, showing
that it is effectively optimal.$^{19}$ In other words, although this algorithm operates in ignorance of
the source statistics, it compresses substantially as well as the best algorithm designed to work
with those statistics.


2.9.1 The LZ77 algorithm 


The LZ77 algorithm compresses a sequence $\mathbf{x} = x_1, x_2, \ldots$ from some given discrete alphabet $\mathcal{X}$
of size $M = |\mathcal{X}|$. At this point, no probabilistic model is assumed for the source, so $\mathbf{x}$ is simply
a sequence of symbols, not a sequence of random symbols. A subsequence $(x_m, x_{m+1}, \ldots, x_n)$
of $\mathbf{x}$ is represented by $x_m^n$.


The algorithm keeps the w most recently encoded source symbols in memory. This is called a 
sliding window of size w. The number w is large, and can be thought of as being in the range of 
$2^{10}$ to $2^{20}$, say. The parameter w is chosen to be a power of 2. Both complexity and, typically,
performance increase with w. 


Briefly, the algorithm operates as follows. Suppose that at some time the source symbols $x_1^P$
have been encoded. The encoder looks for the longest match, say of length n, between the
not-yet-encoded n-string $x_{P+1}^{P+n}$ and a stored string $x_{P+1-u}^{P+n-u}$ starting in the window of length w.
The clever algorithmic idea in LZ77 is to encode this string of n symbols simply by encoding
the integers n and u; i.e., by pointing to the previous occurrence of this string in the sliding
window. If the decoder maintains an identical window, then it can look up the string $x_{P+1-u}^{P+n-u}$,
decode it, and keep up with the encoder.


More precisely, the LZ77 algorithm operates as follows: 


1. Encode the first w symbols in a fixed-length code without compression, using $\lceil \log M \rceil$ bits
per symbol. (Since $w \lceil \log M \rceil$ will be a vanishing fraction of the total number of encoded
bits, the efficiency of encoding this preamble is unimportant, at least in theory.)


2. Set the pointer P = w. (This indicates that all symbols up to and including $x_P$ have been
encoded.)


3. Find the largest $n \ge 2$ such that $x_{P+1}^{P+n} = x_{P+1-u}^{P+n-u}$ for some u in the range $1 \le u \le w$. (Find
the longest match between the not-yet-encoded symbols starting at P + 1 and a string of
symbols starting in the window; let n be the length of that longest match and u the distance
back into the window to the start of that match.) The string $x_{P+1}^{P+n}$ is encoded by encoding
the integers n and u.


Here are two examples of finding this longest match. In the first, the length of the match
is n = 3 and the match starts u = 7 symbols before the pointer. In the second, the length
of the match is 4 and it starts u = 2 symbols before the pointer. This illustrates that
the string and its match can overlap.


19 A proof of this optimality for discrete ergodic sources has been given by Wyner and Ziv [36]. 
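
First example (n = 3, u = 7). The bar | marks the pointer P; the window of size w lies to its left:

    a c d b c d a c b a b a c d b c | a b a b d c a ...

The match is the string a b a that begins 7 symbols before the pointer.


Second example (n = 4, u = 2):

    a c d a b a a c b a b a c d a b | a b a b d c a ...

The match a b a b begins 2 symbols before the pointer and runs past it, overlapping the string being encoded.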


If no match exists for $n \ge 2$, then, independently of whether a match exists for n = 1, set
n = 1 and directly encode the single source symbol $x_{P+1}$ without compression.


4. Encode the integer n into a codeword from the unary-binary code. In the unary-binary
code, a positive integer n is encoded into the binary representation of n, preceded by a
prefix of $\lfloor \log_2 n \rfloor$ zeroes; i.e.,


    n    prefix    base 2 expansion    codeword
    1              1                   1
    2    0         10                  010
    3    0         11                  011
    4    00        100                 00100
    5    00        101                 00101
    6    00        110                 00110
    7    00        111                 00111
    8    000       1000                0001000


Thus the codewords starting with $0^k 1$ correspond to the set of $2^k$ integers in the range
$2^k \le n \le 2^{k+1} - 1$. This code is prefix-free (picture the corresponding binary tree). It can
be seen that the codeword for integer n has length $2\lfloor \log n \rfloor + 1$; it is seen later that this is
negligible compared with the length of the encoding for u.


5. If n > 1, encode the positive integer $u \le w$ using a fixed-length code of length $\log w$ bits.
(At this point the decoder knows n, and can simply count back by u in the previously
decoded string to find the appropriate n-tuple, even if there is overlap as above.)


6. Set the pointer P to P+ n and go to step (3). (Iterate forever.) 
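
Steps 1 to 6 above can be sketched as a short, deliberately inefficient program. This is an illustration under stated assumptions rather than a reference implementation: the input is a finite string (so the loop ends instead of iterating forever), $w \ge 2$ is a power of 2 no larger than the input length, the match search is brute force, and u is sent as the binary representation of u − 1 so that it fits in exactly log w bits. The function names are introduced here for illustration.

```python
from math import ceil, log2


def unary_binary(n):
    """Step 4: floor(log2 n) zeros, then the binary expansion of n."""
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b


def longest_match(x, P, w):
    """Step 3: return (n, u) for the longest match starting at x[P]; n = 1 if none."""
    best_n, best_u = 1, 0
    for u in range(1, w + 1):
        n = 0
        # Comparing symbol by symbol lets the match run past the pointer (overlap).
        while P + n < len(x) and x[P + n] == x[P + n - u]:
            n += 1
        if n > best_n:
            best_n, best_u = n, u
    return best_n, best_u


def lz77_encode(x, alphabet, w):
    """Encode x into a bit string; assumes w >= 2 is a power of 2 and len(x) >= w."""
    sym_bits = max(1, ceil(log2(len(alphabet))))
    u_bits = int(log2(w))

    def raw(c):
        return format(alphabet.index(c), "0{}b".format(sym_bits))

    out = [raw(c) for c in x[:w]]  # step 1: uncompressed preamble
    P = w  # step 2: x[:P] has been encoded
    while P < len(x):
        n, u = longest_match(x, P, w)
        out.append(unary_binary(n))  # step 4
        if n > 1:
            # Step 5; sending u - 1 in log2(w) bits is a choice made for this sketch.
            out.append(format(u - 1, "0{}b".format(u_bits)))
        else:
            out.append(raw(x[P]))  # no match of length >= 2
        P += n  # step 6
    return "".join(out)


def lz77_decode(bits, alphabet, w):
    """Invert lz77_encode by rebuilding the same window symbol by symbol."""
    sym_bits = max(1, ceil(log2(len(alphabet))))
    u_bits = int(log2(w))
    pos = 0

    def take(k):
        nonlocal pos
        chunk = bits[pos:pos + k]
        pos += k
        return int(chunk, 2)

    x = [alphabet[take(sym_bits)] for _ in range(w)]
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":  # count the unary prefix
            zeros += 1
            pos += 1
        n = take(zeros + 1)
        if n > 1:
            u = take(u_bits) + 1
            for _ in range(n):
                x.append(x[-u])  # one symbol at a time, so overlap works
        else:
            x.append(alphabet[take(sym_bits)])
    return "".join(x)


text = "acdbcdacbabacdbc" + "ababdca"
code = lz77_encode(text, "abcd", 16)
print(longest_match(text, 16, 16), lz77_decode(code, "abcd", 16) == text)
```

The test string is chosen so that, with w = 16, the first match found has n = 3 and u = 7, as in the first example of step 3. The decoder copies matched symbols one at a time from u positions back, which is what makes overlapping matches decodable.
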


2.9.2 Why LZ77 works 


The motivation behind LZ77 is information-theoretic. The underlying idea is that if the unknown
source happens to be, say, a Markov source of entropy $H[X|S]$, then the AEP says that, for
any large n, there are roughly $2^{nH[X|S]}$ typical source strings of length n. On the other hand,
a window of size w contains w source strings of length n, counting duplications. This means
that if $w \ll 2^{nH[X|S]}$, then most typical sequences of length n cannot be found in the window,
suggesting that matches of length n are unlikely. Similarly, if $w \gg 2^{nH[X|S]}$, then it is reasonable
to suspect that most typical sequences will be in the window, suggesting that matches of length
n or more are likely.


The above argument, approximate and vague as it is, suggests that in order to achieve large
typical match sizes $n_t$, the window w should be exponentially large, on the order of $2^{n_t H[X|S]}$,
which means

$$n_t \approx \frac{\log w}{H[X|S]}; \qquad \text{typical match size.} \tag{2.42}$$

The encoding for a match requires $\log w$ bits for the match location and $2\lfloor \log n_t \rfloor + 1$ for the
match size $n_t$. Since $n_t$ is proportional to $\log w$, $\log n_t$ is negligible compared to $\log w$ for very
large w. Thus, for the typical case, about $\log w$ bits are used to encode about $n_t$ source symbols.
Thus, from (2.42), the required rate, in bits per source symbol, is about $L \approx H[X|S]$.
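
To get a feel for how large "very large w" must be, the heuristic above can be evaluated numerically. The sketch below simply plugs numbers into (2.42) and the bit counts just given; it inherits all the imprecision of the argument and is not a performance prediction. The entropy value 0.558 bits is the one found in Example 2.8.2, and the function name is illustrative.

```python
from math import floor, log2


def lz77_rate_estimate(w, H):
    """Heuristic bits per source symbol for window size w and entropy H (bits/symbol)."""
    n_t = log2(w) / H  # typical match size from (2.42)
    # log w bits for the match location plus the unary-binary length codeword.
    bits = log2(w) + 2 * floor(log2(n_t)) + 1
    return n_t, bits / n_t


H = 0.558  # conditional entropy of the source of Example 2.8.2, in bits
for k in (10, 20, 40, 80):
    n_t, rate = lz77_rate_estimate(2 ** k, H)
    print(k, round(n_t, 1), round(rate, 3))
```

Under this estimate the overhead of encoding the match length shrinks only like (log log w)/(log w), so the estimated rate approaches $H[X|S]$ slowly as w grows.
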


The above argument is very imprecise, but the conclusion is that, for very large window size, 
L is reduced to the value required when the source is known and an optimal fixed-to-variable 
prefix-free code is used. 


The imprecision above involves more than simply ignoring the approximation factors in the 
AEP. A more conceptual issue is that the strings of source symbols that must be encoded are 
somewhat special since they start at the end of previous matches. The other conceptual difficulty 
comes from ignoring the duplications of typical sequences within the window. 


This argument has been made precise by Wyner and Ziv [36]. 


2.9.3 Discussion 


Let us recapitulate the basic ideas behind the LZ77 algorithm: 


1. Let $N_x$ be the number of occurrences of symbol x in a window of very large size w. If
the source satisfies the WLLN, then the relative frequency $N_x/w$ of appearances of x in
the window will satisfy $N_x/w \approx p_X(x)$ with high probability. Similarly, let $N_{x^n}$ be the
number of occurrences of $x^n$ which start in the window. The relative frequency $N_{x^n}/w$ will
then satisfy $N_{x^n}/w \approx p_{X^n}(x^n)$ with high probability for very large w. This association
of relative frequencies with probabilities is what makes LZ77 a universal algorithm which
needs no prior knowledge of source statistics.$^{20}$


2. Next, as explained in the previous section, the probability of a typical source string $x^n$
for a Markov source is approximately $2^{-nH[X|S]}$. If $w \gg 2^{nH[X|S]}$, then, according to
the previous item, $N_{x^n} \approx w\, p_{X^n}(x^n)$ should be large and $x^n$ should occur in the window
with high probability. Alternatively, if $w \ll 2^{nH[X|S]}$, then $x^n$ will probably not occur.
Consequently the match will usually occur for $n \approx (\log w)/H[X|S]$ as w becomes very large.


3. Finally, it takes about $\log w$ bits to point to the best match in the window. The unary-binary
code uses $2\lfloor \log n \rfloor + 1$ bits to encode the length n of the match. For typical n, this
is on the order of $2 \log(\log w / H[X|S])$, which is negligible for large enough w compared to
$\log w$.


20 As Yogi Berra said, “You can observe a whole lot just by watchin’.”


Consequently, LZ77 requires about $\log w$ encoded bits for each group of about $(\log w)/H[X|S]$
source symbols, so it nearly achieves the optimal efficiency of $L = H[X|S]$ bits/symbol, as w
becomes very large.


Discrete sources, as they appear in practice, often can be viewed over different time scales. Over 
very long time scales, or over the sequences presented to different physical encoders running 
the same algorithm, there is often very little common structure, sometimes varying from one 
language to another, or varying from text in a language to data from something else. 


Over shorter time frames, corresponding to a single file or a single application type, there is 
often more structure, such as that in similar types of documents from the same language. Here 
it is more reasonable to view the source output as a finite length segment of, say, the output of 
an ergodic Markov source. 


What this means is that universal data compression algorithms must be tested in practice. The 
fact that they behave optimally for unknown sources that can be modeled to satisfy the AEP is 
an important guide, but not the whole story. 


The above view of different time scales also indicates that a larger window need not always 
improve the performance of the LZ77 algorithm. It suggests that long matches will be more 
likely in recent portions of the window, so that fixed length encoding of the window position is 
not the best approach. If shorter codewords are used for more recent matches, then it requires 
a shorter time for efficient coding to start to occur when the source statistics abruptly change. 
It also then makes sense to start coding from some arbitrary window known to both encoder 
and decoder rather than filling the entire window with data before starting to use the LZ77 
algorithm.


2.10 Summary of discrete source coding 


Discrete source coding is important both for discrete sources such as text and computer files and 
also as an inner layer for discrete-time analog sequences and fully analog sources. It is essential 
to focus on the range of possible outputs from the source rather than any one particular output. 
It is also important to focus on probabilistic models so as to achieve the best compression for the 
most common outputs with less care for very rare outputs. Even universal coding techniques, 
such as LZ77, which are designed to work well in the absence of a probability model, require 
probability models to understand and evaluate how they work. 


Variable-length source coding is the simplest way to provide good compression for common 
source outputs at the expense of rare outputs. The necessity to concatenate successive variable- 
length codewords leads to the non-probabilistic concept of unique decodability. Prefix-free codes 
provide a simple class of uniquely-decodable codes. Both prefix-free codes and the more general 
class of uniquely-decodable codes satisfy the Kraft inequality on the number of possible code 
words of each length. Moreover, for any set of lengths satisfying the Kraft inequality, there is 
a simple procedure for constructing a prefix-free code with those lengths. Since the expected 
length, and other important properties of codes, depend only on the codeword lengths (and
how they are assigned to source symbols), there is usually little reason to use variable-length 
codes that are not also prefix free. 


For a DMS with given probabilities on the symbols of a source code, the entropy is a lower 
bound on the expected length of uniquely decodable codes. The Huffman algorithm provides a 
simple procedure for finding an optimal (in the sense of minimum expected codeword length)
variable-length prefix-free code. The Huffman algorithm is also useful for deriving properties 
about optimal variable length source codes (see Exercises 2.12 to 2.18). 


All the properties of variable-length codes extend immediately to fixed-to-variable-length codes. 
Here the source output sequence is segmented into blocks of n symbols, each of which is then 
encoded as a single symbol from the alphabet of source n-tuples. For a DMS the minimum 
expected codeword length per source symbol then lies between H[U] and H[U] + 1/n. Thus 
prefix-free fixed-to-variable-length codes can approach the entropy bound as closely as desired. 


One of the disadvantages of fixed-to-variable-length codes is that bits leave the encoder at a 
variable rate relative to incoming symbols. Thus if the incoming symbols have a fixed rate and 
the bits must be fed into a channel at a fixed rate (perhaps with some idle periods), then the 
encoded bits must be queued and there is a positive probability that any finite length queue will 
overflow. 


An alternative point of view is to consider fixed-length to fixed-length codes. Here, for a DMS, 
the set of possible n-tuples of symbols from the source can be partitioned into a typical set and 
an atypical set. For large n, the AEP says that there are essentially $2^{nH[U]}$ typical n-tuples with
an aggregate probability approaching 1 with increasing n. Encoding just the typical n-tuples 
requires about H[U] bits per symbol, thus approaching the entropy bound without the above 
queueing problem, but, of course, with occasional errors. 


As detailed in the text, the AEP can be used to look at the long-term behavior of arbitrary 
source coding algorithms to show that the entropy bound cannot be exceeded without a failure 
rate that approaches 1. 


The above results for discrete memoryless sources extend easily to ergodic Markov sources. 
The text does not carry out this analysis in detail since readers are not assumed to have the 
requisite knowledge about Markov chains (see [7] for the detailed analysis). The important thing 
here is to see that Markov sources can model n-gram statistics for any desired n and thus can 
model fairly general sources (at the cost of very complex models). From a practical standpoint, 
universal source codes, such as LZ77, are usually a more reasonable approach to complex and
partly unknown sources. 


2.E Exercises


2.1. Chapter 1 pointed out that voice waveforms could be converted to binary data by sampling 
at 8000 times per second and quantizing to 8 bits per sample, yielding 64kb/s. It then 
said that modern speech coders can yield telephone-quality speech at 6-16 kb/s. If your 
objective were simply to reproduce the words in speech recognizably without concern for 
speaker recognition, intonation, etc., make an estimate of how many kb/s would be required. 
Explain your reasoning. (Note: There is clearly no “correct answer” here; the question is 
too vague for that. The point of the question is to get used to questioning objectives and 
approaches.) 


2.2. Let V and W be discrete rv’s defined on some probability space with a joint pmf $p_{VW}(v, w)$.

(a) Prove that $E[V + W] = E[V] + E[W]$. Do not assume independence.

(b) Prove that if V and W are independent rv’s, then $E[V \cdot W] = E[V] \cdot E[W]$.

(c) Assume that V and W are not independent. Find an example where $E[V \cdot W] \ne E[V] \cdot E[W]$
and another example where $E[V \cdot W] = E[V] \cdot E[W]$.

(d) Assume that V and W are independent and let $\sigma_V^2$ and $\sigma_W^2$ be the variances of V and
W respectively. Show that the variance of V + W is given by $\sigma_{V+W}^2 = \sigma_V^2 + \sigma_W^2$.


2.3. (a) For a nonnegative integer-valued rv N, show that $E[N] = \sum_{n>0} \Pr(N \ge n)$.

(b) Show, with whatever mathematical care you feel comfortable with, that for an arbitrary
nonnegative rv X that $E(X) = \int_0^\infty \Pr(X > a)\, da$.

(c) Derive the Markov inequality, which says that for any $a > 0$ and any nonnegative rv X,
$\Pr(X \ge a) \le E[X]/a$. Hint: Sketch $\Pr(X > a)$ as a function of a and compare the area of the
rectangle from 0 to a on the abscissa and 0 to $\Pr(X \ge a)$ with the area corresponding to
$E[X]$.

(d) Derive the Chebyshev inequality, which says that $\Pr(|Y - E[Y]| \ge b) \le \sigma_Y^2 / b^2$ for any rv
Y with finite mean $E[Y]$ and finite variance $\sigma_Y^2$. Hint: Use part (c) with $(Y - E[Y])^2 = X$.


2.4. Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of independent identically distributed (iid) analog
rv’s with the common probability density function $f_X(x)$. Note that $\Pr\{X_n = \alpha\} = 0$ for all
$\alpha$ and that $\Pr\{X_n = X_m\} = 0$ for $m \ne n$.

(a) Find $\Pr\{X_1 \le X_2\}$. [Give a numerical answer, not an expression; no computation is
required and a one or two line explanation should be adequate.]

(b) Find $\Pr\{X_1 \le X_2; X_1 \le X_3\}$ (in other words, find the probability that $X_1$ is the smallest
of $\{X_1, X_2, X_3\}$). [Again, think, don’t compute.]

(c) Let the rv N be the index of the first rv in the sequence to be less than $X_1$; that is,
$\Pr\{N = n\} = \Pr\{X_1 \le X_2; X_1 \le X_3; \cdots; X_1 \le X_{n-1}; X_1 > X_n\}$. Find $\Pr\{N \ge n\}$ as a
function of n. Hint: generalize part (b).

(d) Show that $E[N] = \infty$. Hint: use part (a) of Exercise 2.3.

(e) Now assume that $X_1, X_2, \ldots$ is a sequence of iid rv’s each drawn from a finite set of
values. Explain why you can’t find $\Pr\{X_1 \le X_2\}$ without knowing the pmf. Explain why
$E[N] = \infty$.


2.5. Let $X_1, X_2, \ldots, X_n$ be a sequence of n binary iid rv’s. Assume that $\Pr\{X_m = 1\} =
\Pr\{X_m = 0\} = \frac{1}{2}$. Let Z be a parity check on $X_1, \ldots, X_n$; that is, $Z = X_1 \oplus X_2 \oplus \cdots \oplus X_n$
(where $0 \oplus 0 = 1 \oplus 1 = 0$ and $0 \oplus 1 = 1 \oplus 0 = 1$).

(a) Is Z independent of $X_1$? (Assume n > 1.)

(b) Are $Z, X_1, \ldots, X_{n-1}$ independent?

(c) Are $Z, X_1, \ldots, X_n$ independent?

(d) Is Z independent of $X_1$ if $\Pr\{X_i = 1\} \ne \frac{1}{2}$? You may take n = 2 here.


2.6. Define a suffix-free code as a code in which no codeword is a suffix of any other codeword. 


(a) Show that suffix-free codes are uniquely decodable. Use the definition of unique decod- 
ability in Section 2.3.1, rather than the intuitive but vague idea of decodability with initial 
synchronization. 


(b) Find an example of a suffix-free code with codeword lengths (1, 2, 2) that is not a 
prefix-free code. Can a codeword be decoded as soon as its last bit arrives at the decoder? 
Show that a decoder might have to wait for an arbitrarily long time before decoding (this 
is why a careful definition of unique decodability is required). 


(c) Is there a code with codeword lengths (1, 2, 2) that is both prefix-free and suffix-free? 
Explain. 


2.7. The algorithm given in essence by (2.2) for constructing prefix-free codes from a set of 
codeword lengths uses the assumption that the lengths have been ordered first. Give an 
example in which the algorithm fails if the lengths are not ordered first. 


2.8. Suppose that, for some reason, you wish to encode a source into symbols from a D-ary 
alphabet (where D is some integer greater than 2) rather than into a binary alphabet. The 
development of Section 2.3 can be easily extended to the D-ary case, using D-ary trees 
rather than binary trees to represent prefix-free codes. Generalize the Kraft inequality, 
(2.1), to the D-ary case and outline why it is still valid. 


2.9. Suppose a prefix-free code has symbol probabilities $p_1, p_2, \ldots, p_M$ and lengths $l_1, \ldots, l_M$.
Suppose also that the expected length L satisfies $L = H[X]$.


(a) Explain why $p_i = 2^{-l_i}$ for each i.


(b) Explain why the sequence of encoded binary digits is a sequence of iid equiprobable 
binary digits. Hint: Use figure 2.4 to illustrate this phenomenon and explain in words why 
the result is true in general. Do not attempt a general proof. 


2.10. (a) Show that in a code of M codewords satisfying the Kraft inequality with equality, the 
maximum length is at most M — 1. Explain why this ensures that the number of distinct 
such codes is finite. 


(b) Consider the number S(M) of distinct full code trees with M terminal nodes. Count 
two trees as being different if the corresponding set of codewords is different. That is, ignore 
the set of source symbols and the mapping between source symbols and codewords. Show 
that $S(2) = 1$ and show that for $M > 2$, $S(M) = \sum_{j=1}^{M-1} S(j)\, S(M - j)$ where $S(1) = 1$ by
convention.


2.11. (Proof of the Kraft inequality for uniquely decodable codes) (a) Assume a uniquely
decodable code has lengths $l_1, \ldots, l_M$. In order to show that $\sum_j 2^{-l_j} \le 1$, demonstrate the
following identity for each integer $n \ge 1$:

$$\left( \sum_{j=1}^{M} 2^{-l_j} \right)^n = \sum_{j_1=1}^{M} \sum_{j_2=1}^{M} \cdots \sum_{j_n=1}^{M} 2^{-(l_{j_1} + l_{j_2} + \cdots + l_{j_n})}$$

(b) Show that there is one term on the right for each concatenation of n codewords (i.e.,
for the encoding of one n-tuple $x^n$) where $l_{j_1} + l_{j_2} + \cdots + l_{j_n}$ is the aggregate length of that
concatenation.

(c) Let $A_i$ be the number of concatenations which have overall length i and show that

$$\left( \sum_{j=1}^{M} 2^{-l_j} \right)^n = \sum_{i=1}^{n l_{\max}} A_i\, 2^{-i}$$

(d) Using the unique decodability, upper bound each $A_i$ and show that

$$\left( \sum_{j=1}^{M} 2^{-l_j} \right)^n \le n l_{\max}$$

(e) By taking the nth root and letting $n \to \infty$, demonstrate the Kraft inequality.


2.12. A source with an alphabet size of $M = |\mathcal{X}| = 4$ has symbol probabilities {1/3, 1/3, 2/9, 1/9}.
(a) Use the Huffman algorithm to find an optimal prefix-free code for this source. 


(b) Use the Huffman algorithm to find another optimal prefix-free code with a different set 
of lengths. 


(c) Find another prefix-free code that is optimal but cannot result from using the Huffman 
algorithm. 
2.13. An alphabet of M = 4 symbols has probabilities $p_1 \ge p_2 \ge p_3 \ge p_4 > 0$.


(a) Show that if $p_1 = p_3 + p_4$, then a Huffman code exists with all lengths equal and another
exists with a codeword of length 1, one of length 2, and two of length 3.


(b) Find the largest value of $p_1$, say $p_{\max}$, for which $p_1 = p_3 + p_4$ is possible.


(c) Find the smallest value of $p_1$, say $p_{\min}$, for which $p_1 = p_3 + p_4$ is possible.


(d) Show that if $p_1 > p_{\max}$, then every Huffman code has a length 1 codeword.


(e) Show that if $p_1 > p_{\max}$, then every optimal prefix-free code has a length 1 codeword.


(f) Show that if $p_1 < p_{\min}$, then all codewords have length 2 in every Huffman code.


(g) Suppose M > 4. Find the smallest value of $p'_{\max}$ such that $p_1 > p'_{\max}$ guarantees that a
Huffman code will have a length 1 codeword.


2.14. Consider a source with M equiprobable symbols. 


(a) Let $k = \lceil \log M \rceil$. Show that, for a Huffman code, the only possible codeword lengths
are k and k − 1.

(b) As a function of M, find how many codewords have length $k = \lceil \log M \rceil$. What is the
expected codeword length $\overline{L}$ in bits per source symbol?

(c) Define $y = M/2^k$. Express $\overline{L} - \log M$ as a function of y. Find the maximum value of
this function over $1/2 < y \le 1$. This illustrates that the entropy bound, $\overline{L} < H[X] + 1$, is
rather loose in this equiprobable case.


2.15. Let a discrete memoryless source have M symbols with alphabet {1, 2, ..., M} and ordered
probabilities $p_1 > p_2 > \cdots > p_M > 0$. Assume also that $p_1 < p_{M-1} + p_M$. Let $l_1, l_2, \ldots, l_M$
be the lengths of a prefix-free code of minimum expected length for such a source.


(a) Show that $l_1 \le l_2 \le \cdots \le l_M$.

(b) Show that if the Huffman algorithm is used to generate the above code, then $l_M \le l_1 + 1$.
Hint: Look only at the first two steps of the algorithm.


(c) Show that $l_M \le l_1 + 1$ whether or not the Huffman algorithm is used to generate a
minimum expected length prefix-free code.


(d) Suppose $M = 2^k$ for integer k. Determine $l_1, \ldots, l_M$.

(e) Suppose $2^k < M < 2^{k+1}$ for integer k. Determine $l_1, \ldots, l_M$.


2.16. (a) Consider extending the Huffman procedure to codes with ternary symbols {0, 1,2}. 
Think in terms of codewords as leaves of ternary trees. Assume an alphabet with M = 4 
symbols. Note that you cannot draw a full ternary tree with 4 leaves. By starting with a 
tree of 3 leaves and extending the tree by converting leaves into intermediate nodes, show 
for what values of M it is possible to have a complete ternary tree. 


(b) Explain how to generalize the Huffman procedure to ternary symbols, bearing in mind 
your result in part (a). 


(c) Use your algorithm for the set of probabilities {0.3, 0.2,0.2,0.1,0.1,0.1}. 


2.17. Let X have M symbols, {1, 2, ..., M}, with ordered probabilities $p_1 > p_2 > \cdots > p_M > 0$.
Let X' be the reduced source after the first step of the Huffman algorithm.


(a) Express the entropy H[X] for the original source in terms of the entropy H[X'] of the
reduced source as

$$H[X] = H[X'] + (p_M + p_{M-1})\, H(\gamma), \qquad (2.43)$$

where $H(\gamma)$ is the binary entropy function, $H(\gamma) = -\gamma \log \gamma - (1-\gamma) \log(1-\gamma)$. Find the
required value of $\gamma$ to satisfy (2.43).


(b) In the code tree generated by the Huffman algorithm, let $v_1$ denote the intermediate node
that is the parent of the leaf nodes for symbols M and M−1. Let $q_1 = p_M + p_{M-1}$ be the
probability of reaching $v_1$ in the code tree. Similarly, let $v_2, v_3, \ldots$ denote the subsequent
intermediate nodes generated by the Huffman algorithm. How many intermediate nodes
are there, including the root node of the entire tree?


(c) Let $q_1, q_2, \ldots$ be the probabilities of reaching the intermediate nodes $v_1, v_2, \ldots$ (note
that the probability of reaching the root node is 1). Show that $\overline{L} = \sum_i q_i$. Hint: Note that
$\overline{L} = \overline{L}' + q_1$.


(d) Express H[X] as a sum over the intermediate nodes. The ith term in the sum should
involve $q_i$ and the binary entropy $H(\gamma_i)$ for some $\gamma_i$ to be determined. You may find it helpful
to define $\alpha_i$ as the probability of moving upward from intermediate node $v_i$, conditional on
reaching $v_i$. (Hint: look at part (a).)

(e) Find the conditions (in terms of the probabilities and binary entropies above) under 
which L = H[X]. 


(f) Are the formulas for $\overline{L}$ and H[X] above specific to Huffman codes alone, or do they
apply (with the modified intermediate node probabilities and entropies) to arbitrary full 
prefix-free codes? 


2.18. Consider a discrete random symbol X with M+1 symbols for which $p_1 > p_2 > \cdots > p_M > 0$
and $p_{M+1} = 0$. Suppose that a prefix-free code is generated for X and that for some reason,
this code contains a codeword for M+1 (suppose for example that $p_{M+1}$ is actually positive
but so small that it is approximated as 0).


(a) Find L for the Huffman code including symbol M+1 in terms of L for the Huffman code 
omitting a codeword for symbol M+1. 


(b) Suppose now that instead of one symbol of zero probability, there are n such symbols. 
Repeat part (a) for this case. 


2.19. In (2.12), it is shown that if X and Y are independent discrete random symbols, then the
entropy for the random symbol XY satisfies H[XY] = H[X] + H[Y]. Here we want to show
that, without the assumption of independence, we have H[XY] ≤ H[X] + H[Y].


(a) Show that

$$H[XY] - H[X] - H[Y] = \sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} p_{XY}(x, y) \log \frac{p_X(x)\, p_Y(y)}{p_{XY}(x, y)}$$


(b) Show that H[XY] − H[X] − H[Y] ≤ 0, i.e., that H[XY] ≤ H[X] + H[Y].


(c) Let $X_1, X_2, \ldots, X_n$ be discrete random symbols, not necessarily independent. Use (b)
to show that

$$H[X_1 X_2 \cdots X_n] \le \sum_{j=1}^{n} H[X_j]$$


2.20. Consider a random symbol X with the symbol alphabet {1, 2, ..., M} and a pmf
$\{p_1, p_2, \ldots, p_M\}$. This exercise derives a relationship called Fano’s inequality between the
entropy H[X] and the probability $p_1$ of the first symbol. This relationship is used to prove
the converse to the noisy channel coding theorem. Let Y be a random symbol that is 1 if
X = 1 and 0 otherwise. For parts (a) through (d), consider M and $p_1$ to be fixed.

(a) Express H[Y] in terms of the binary entropy function, $H_b(\alpha) = -\alpha \log(\alpha) - (1-\alpha) \log(1-\alpha)$.

(b) What is the conditional entropy H[X | Y=1]?

(c) Show that H[X | Y=0] ≤ log(M − 1) and show how this bound can be met with equality
by appropriate choice of $p_2, \ldots, p_M$. Combine this with part (b) to upper bound H[X|Y].


(d) Find the relationship between H[X] and H[XY].


(e) Use H[Y] and H[X|Y] to upper bound H[X] and show that the bound can be met with
equality by appropriate choice of $p_2, \ldots, p_M$.

(f) For the same value of M as before, let $p_1, \ldots, p_M$ be arbitrary and let $p_{\max}$ be
$\max\{p_1, \ldots, p_M\}$. Is your upper bound in (d) still valid if you replace $p_1$ by $p_{\max}$? Explain.


2.21. A discrete memoryless source emits iid random symbols $X_1, X_2, \ldots$. Each random symbol
X has the symbols {a, b, c} with probabilities {0.5, 0.4, 0.1}, respectively.


(a) Find the expected length $\overline{L}_{\min}$ of the best variable-length prefix-free code for X.


(b) Find the expected length $\overline{L}_{\min,2}$, normalized to bits per symbol, of the best
variable-length prefix-free code for $X^2$.


(c) Is it true that for any DMS, $\overline{L}_{\min} \ge \overline{L}_{\min,2}$? Explain.


2.22. For a DMS X with alphabet $\mathcal{X} = \{1, 2, \ldots, M\}$, let $\overline{L}_{\min,1}$, $\overline{L}_{\min,2}$, and $\overline{L}_{\min,3}$ be the
normalized average lengths, in bits per source symbol, for a Huffman code over $\mathcal{X}$, $\mathcal{X}^2$ and
$\mathcal{X}^3$ respectively. Show that $\overline{L}_{\min,3} \le \tfrac{2}{3}\, \overline{L}_{\min,2} + \tfrac{1}{3}\, \overline{L}_{\min,1}$.


2.23. (Run-Length Coding) Suppose $X_1, X_2, \ldots$ is a sequence of binary random symbols with
$p_X(a) = 0.9$ and $p_X(b) = 0.1$. We encode this source by a variable-to-variable-length
encoding technique known as run-length coding. The source output is first mapped into
intermediate digits by counting the number of occurrences of a between each successive b.
Thus an intermediate output occurs on each occurrence of the symbol b. Since we don’t
want the intermediate digits to get too large, however, the intermediate digit 8 is used on
the eighth consecutive a and the counting restarts at this point. Thus, outputs appear on
each b and on each 8 a’s. For example, the first two lines below illustrate a string of source
outputs and the corresponding intermediate outputs.


b    a a a b    a a a a a a a a    a a b    b    a a a a b
0    3          8                  2        0    4
0000 0011       1                  0010     0000 0100

The final stage of encoding assigns the codeword 1 to the intermediate integer 8, and assigns
a 4 bit codeword consisting of 0 followed by the three bit binary representation for each
integer 0 to 7. This is illustrated in the third line above.


(a) Show why the overall code is uniquely decodable. 


(b) Find the expected total number of output bits corresponding to each occurrence of the 
letter b. This total number includes the four bit encoding of the letter b and the one bit 
encoding for each consecutive string of 8 letter a’s preceding that letter b. 


(c) By considering a string of $10^{20}$ binary symbols into the encoder, show that the number
of b’s to occur per input symbol is, with very high probability, very close to 0.1.


(d) Combine parts (b) and (c) to find $\overline{L}$, the expected number of output bits per input
symbol.
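
The two-stage mapping described in Exercise 2.23 can be sketched in a few lines. The code below is only an illustration of the encoder as described, applied to the example string from the exercise; it does not address parts (a) through (d). The function name `run_length_encode` and the decision to produce no output for an unfinished trailing run are our assumptions.

```python
def run_length_encode(source):
    """Encode a string over {a, b} with the run-length scheme of Exercise 2.23.

    Returns (intermediate digits, codewords). A trailing run of a's that is
    closed neither by a 'b' nor by an eighth 'a' produces no output here.
    """
    digits = []
    run = 0                      # number of a's since the last output
    for sym in source:
        if sym == "a":
            run += 1
            if run == 8:         # eighth consecutive a: emit 8 and restart
                digits.append(8)
                run = 0
        elif sym == "b":         # a 'b' closes the current run
            digits.append(run)
            run = 0
        else:
            raise ValueError("source symbols must be 'a' or 'b'")
    # 8 -> '1'; 0..7 -> '0' followed by the three-bit binary representation
    codewords = ["1" if d == 8 else "0" + format(d, "03b") for d in digits]
    return digits, codewords

# The example string of the exercise, built piece by piece.
example = "b" + "aaa" + "b" + "a" * 10 + "b" + "b" + "aaaa" + "b"
digits, codewords = run_length_encode(example)
print(digits)               # [0, 3, 8, 2, 0, 4]
print(" ".join(codewords))  # 0000 0011 1 0010 0000 0100
```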
2.24. (a) Suppose a DMS emits h and t with probability 1/2 each. For ε = 0.01, what is $T_\varepsilon^5$?
(b) Find $T_\varepsilon^1$ for Pr(h) = 0.1, Pr(t) = 0.9, and ε = 0.001.
2.25. Consider a DMS with a two symbol alphabet, {a, b} where $p_X(a) = 2/3$ and $p_X(b) = 1/3$.
Let $X^n = X_1, \ldots, X_n$ be a string of random symbols from the source with n = 100,000.


(a) Let $W(X_j)$ be the log pmf rv for the jth source output, i.e., $W(X_j) = -\log 2/3$ for
$X_j = a$ and $-\log 1/3$ for $X_j = b$. Find the variance of $W(X_j)$.

(b) For ε = 0.01, evaluate the bound on the probability of the typical set given in (2.24).


(c) Let $N_a$ be the number of a’s in the string $X^n = X_1, \ldots, X_n$. The rv $N_a$ is the sum of
n iid rv’s. Show what these rv’s are.


(d) Express the rv $W(X^n)$ as a function of the rv $N_a$. Note how this depends on n.


(e) Express the typical set in terms of bounds on $N_a$ (i.e., $T_\varepsilon^n = \{x^n : \alpha < N_a < \beta\}$) and
calculate α and β.


(f) Find the mean and variance of $N_a$. Approximate $\Pr\{T_\varepsilon^n\}$ by the central limit theorem
approximation. The central limit theorem approximation is to evaluate $\Pr\{T_\varepsilon^n\}$ assuming
that $N_a$ is Gaussian with the mean and variance of the actual $N_a$.


One point of this exercise is to illustrate that the Chebyshev inequality used in bounding
$\Pr(T_\varepsilon^n)$ in the text is very weak (although it is a strict bound, whereas the Gaussian
approximation here is relatively accurate but not a bound). Another point is to show that n must
be very large for the typical set to look typical.


2.26. For the rv’s in the previous exercise, find $\Pr\{N_a = i\}$ for i = 0, 1, 2. Find the probability
of each individual string $x^n$ for those values of i. Find the particular string $x^n$ that has
maximum probability over all sample values of $X^n$. What are the next most probable
n-strings? Give a brief discussion of why the most probable n-strings are not regarded as
typical strings.


2.27. Let $X_1, X_2, \ldots$ be a sequence of iid symbols from a finite alphabet. For any block length
n and any small number ε > 0, define the good set of n-tuples $x^n$ as the set

$$G_\varepsilon^n = \bigl\{ x^n : p_{X^n}(x^n) > 2^{-n[H[X]+\varepsilon]} \bigr\}.$$


(a) Explain how $G_\varepsilon^n$ differs from the typical set $T_\varepsilon^n$.


(b) Show that $\Pr(G_\varepsilon^n) \ge 1 - \frac{\sigma_W^2}{n \varepsilon^2}$, where W is the log pmf rv for X. Nothing elaborate is
expected here.


(c) Derive an upper bound on the number of elements in $G_\varepsilon^n$ of the form $|G_\varepsilon^n| < 2^{n(H[X]+\alpha)}$
and determine the value of α. (You are expected to find the smallest such α that you can,
but not to prove that no smaller value can be used in an upper bound).

(d) Let $G_\varepsilon^n - T_\varepsilon^n$ be the set of n-tuples $x^n$ that lie in $G_\varepsilon^n$ but not in $T_\varepsilon^n$. Find an upper
bound to $|G_\varepsilon^n - T_\varepsilon^n|$ of the form $|G_\varepsilon^n - T_\varepsilon^n| \le 2^{n(H[X]+\beta)}$. Again find the smallest β that you
can.


(e) Find the limit of $|G_\varepsilon^n - T_\varepsilon^n| / |T_\varepsilon^n|$ as $n \to \infty$.


2.28. The typical set $T_\varepsilon^n$ defined in the text is often called a weakly typical set, in contrast to
another kind of typical set called a strongly typical set. Assume a discrete memoryless
source and let $N_j(x^n)$ be the number of symbols in an n string $x^n$ taking on the value j.
Then the strongly typical set $S_\varepsilon^n$ is defined as

$$S_\varepsilon^n = \Bigl\{ x^n : p_j(1-\varepsilon) < \frac{N_j(x^n)}{n} < p_j(1+\varepsilon); \ \text{for all } j \in \mathcal{X} \Bigr\}.$$

(a) Show that $p_{X^n}(x^n) = \prod_j p_j^{N_j(x^n)}$.

(b) Show that every $x^n$ in $S_\varepsilon^n$ has the property that

$$H[X](1-\varepsilon) < \frac{-\log p_{X^n}(x^n)}{n} < H[X](1+\varepsilon)$$

(c) Show that if $x^n \in S_\varepsilon^n$, then $x^n \in T_{\varepsilon'}^n$ with ε' = H[X]ε, i.e., that $S_\varepsilon^n \subseteq T_{\varepsilon'}^n$.


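(d) Show that for any δ > 0 and all sufficiently large n,

$$\Pr(X^n \notin S_\varepsilon^n) \le \delta$$

Hint: Taking each letter j separately, 1 ≤ j ≤ M, show that for all sufficiently large n,
$\Pr\bigl( \bigl| \frac{N_j}{n} - p_j \bigr| \ge \varepsilon \bigr) \le \frac{\delta}{M}$.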
(e) Show that for all δ > 0 and all sufficiently large n,

$$(1-\delta)\, 2^{n(H[X]-\varepsilon')} < |S_\varepsilon^n| < 2^{n(H[X]+\varepsilon')}. \qquad (2.44)$$


Note that parts (d) and (e) constitute the same theorem for the strongly typical set as
Theorem 2.7.1 establishes for the weakly typical set. Typically the n required for (2.44) to
hold (with the above correspondence between ε and ε') is considerably larger than that
for (2.27) to hold. We will use strong typicality later in proving the noisy channel coding
theorem.


2.29. (a) The random variable $D_n$ in Subsection 2.7.4 was defined as the initial string length of
encoded bits required to decode the first n symbols of the source input. For the run-length
coding example in Exercise 2.23, list the input strings and corresponding encoded output
strings that must be inspected to decode the first source letter and from this find the pmf
of $D_1$. Hint: As many as 8 source letters must be encoded before $X_1$ can be
decoded.


(b) Find the pmf of $D_2$. One point of this exercise is to convince you that $D_n$ is a useful
rv for proving theorems, but not an rv that is useful for detailed computation. It also shows
clearly that $D_n$ can depend on more than the first n source letters.


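2.30. The Markov chain $S_0, S_1, \ldots$ below starts in steady state at time 0 and has 4 states, $\mathcal{S} = \{1, 2, 3, 4\}$.
The corresponding Markov source $X_1, X_2, \ldots$ has a source alphabet $\mathcal{X} = \{a, b, c\}$
of size 3.

[Figure: state-transition diagram of the four-state Markov chain. Each transition is labeled with a source letter and a probability (labels such as b; 1/2, a; 1/2 and a; 1 survive); the graph itself was lost in extraction.]


(a) Find the steady-state probabilities {q(s)} of the Markov chain.
(c) Find $H[X_1 | S_0]$.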


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications |, Fall 2006. MIT OpenCourseWare 
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 




(d) Describe a uniquely-decodable encoder for which $\overline{L} = H[X_1 | S_0]$. Assume that the initial
state is known to the decoder. Explain why the decoder can track the state after time 0.


(e) Suppose you observe the source output without knowing the state. What is the maximum
number of source symbols you must observe before knowing the state?


2.31. Let $X_1, X_2, \ldots, X_n$ be discrete random symbols. Derive the following chain rule:

$$H[X_1, \ldots, X_n] = H[X_1] + \sum_{k=2}^{n} H[X_k \mid X_1, \ldots, X_{k-1}]$$


Hint: Use the chain rule for n = 2 in (2.37) and ask yourself whether a k-tuple of random
symbols is itself a random symbol.


2.32. Consider a discrete ergodic Markov chain $S_0, S_1, \ldots$ with an arbitrary initial state distribution.

(a) Show that $H[S_2 | S_1 S_0] = H[S_2 | S_1]$ (use the basic definition of conditional entropy).

(b) Show with the help of Exercise 2.31 that for any n ≥ 2,

$$H[S_1 S_2 \cdots S_n \mid S_0] = \sum_{k=1}^{n} H[S_k \mid S_{k-1}].$$


(c) Simplify this for the case where $S_0$ is in steady state.


(d) For a Markov source with outputs $X_1 X_2 \cdots$, explain why $H[X_1 \cdots X_n | S_0] = H[S_1 \cdots S_n | S_0]$.
You may restrict this to n = 2 if you desire.


(e) Verify (2.40). 


2.33. Perform an LZ77 parsing of the string 000111010010101100. Assume a window of length
W = 8; the initial window (underlined in the original) is the first 8 digits, 00011101. You should
parse the rest of the string using the Lempel-Ziv algorithm.


2.34. Suppose that the LZ77 algorithm is used on the binary string $x_1^{10,000} = 0^{5000}\, 1^{4000}\, 0^{1000}$.


This notation means 5000 repetitions of 0 followed by 4000 repetitions of 1 followed by 1000 
repetitions of 0. Assume a window size w = 1024. 


(a) Describe how the above string would be encoded. Give the encoded string and describe 
its substrings. 


(b) How long is the encoded string? 


(c) Suppose that the window size is reduced to w = 8. How long would the encoded string 
be in this case? (Note that such a small window size would only work well for really simple 
examples like this one.) 


(d) Create a Markov source model with 2 states that is a reasonably good model for this 
source output. You are not expected to do anything very elaborate here; just use common 
sense. 


(e) Find the entropy in bits per source symbol for your source model. 
2.35. (a) Show that if an optimum (in the sense of minimum expected length) prefix-free code is
chosen for any given pmf (subject to the condition $p_i > p_j$ for i < j), the code word lengths
satisfy $l_i \le l_j$ for all i < j. Use this to show that for all j > 1

$$l_j \ge \lfloor \log j \rfloor + 1$$


(b) The asymptotic efficiency of a prefix-free code for the positive integers is defined to be
$\lim_{j \to \infty} \frac{l_j}{\log j}$. What is the asymptotic efficiency of the unary-binary code?
(c) Explain how to construct a prefix-free code for the positive integers where the asymptotic
efficiency is 1. Hint: Replace the unary code for the integers $n = \lfloor \log j \rfloor + 1$ in the
unary-binary code with a code whose length grows more slowly with increasing n.
Chapter 3


Quantization 


3.1 Introduction to quantization 


The previous chapter discussed coding and decoding for discrete sources. Discrete sources are 
a subject of interest in their own right (for text, computer files, etc.) and also serve as the 
inner layer for encoding analog source sequences and waveform sources (see Figure 3.1). This 
chapter treats coding and decoding for a sequence of analog values. Source coding for analog 
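values is usually called quantization. Note that this is also the middle layer for waveform
encoding/decoding.

[Figure 3.1 block diagram. Encoder chain: input waveform → sampler → quantizer → discrete encoder → reliable binary channel. Decoder chain: reliable binary channel → discrete decoder → table lookup → analog filter → output waveform. An analog sequence passes between the sampler/filter layer and the quantizer/table lookup layer, and a symbol sequence passes between the quantizer/table lookup layer and the discrete encoder/decoder.]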


Figure 3.1: Encoding and decoding of discrete sources, analog sequence sources, and 
waveform sources. Quantization, the topic of this chapter, is the middle layer and 
should be understood before trying to understand the outer layer, which deals with 
waveform sources. 


The input to the quantizer will be modeled as a sequence $U_1, U_2, \ldots$ of analog random variables
(rv’s). The motivation for this is much the same as that for modeling the input to a discrete
source encoder as a sequence of random symbols. That is, the design of a quantizer should be
responsive to the set of possible inputs rather than being designed for only a single sequence of
numerical inputs. Also, it is desirable to treat very rare inputs differently from very common
inputs, and a probability density is an ideal approach for this. Initially, $U_1, U_2, \ldots$ will be taken
as independent identically distributed (iid) analog rv’s with some given probability density
function (pdf) $f_U(u)$.


A quantizer, by definition, maps the incoming sequence $U_1, U_2, \ldots$ into a sequence of discrete
rv’s $V_1, V_2, \ldots$, where the objective is that $V_m$, for each m in the sequence, should represent $U_m$
with as little distortion as possible. Assuming that the discrete encoder/decoder at the inner
layer of Figure 3.1 is uniquely decodable, the sequence $V_1, V_2, \ldots$ will appear at the output of
the discrete encoder and will be passed through the middle layer (denoted ‘table lookup’) to
represent the input $U_1, U_2, \ldots$. The output side of the quantizer layer is called a ‘table lookup’
because the alphabet for each discrete random variable $V_m$ is a finite set of real numbers, and
these are usually mapped into another set of symbols such as the integers 1 to M for an M
symbol alphabet. Thus on the output side a look-up function is required to convert back to the
numerical value $V_m$.


As discussed in Section 2.1, the quantizer output $V_m$, if restricted to an alphabet of M possible
values, cannot represent the analog input $U_m$ perfectly. Increasing M, i.e., quantizing more
finely, typically reduces the distortion, but cannot eliminate it.


When an analog rv U is quantized into a discrete rv V, the mean-squared distortion is defined
to be $E[(U-V)^2]$. Mean-squared distortion (often called mean-squared error) is almost invariably
used in this text to measure distortion. When studying the conversion of waveforms into
sequences in the next chapter, it will be seen that mean-squared distortion is a particularly
convenient measure for converting the distortion for the sequence into the distortion for the
waveform.
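
To make the definition concrete, the following sketch estimates $E[(U-V)^2]$ numerically for one simple case. Everything specific here is an illustrative assumption rather than something taken from the text: U is uniform on [0, 1), the quantizer has M equal-width cells with the cell midpoints as outputs, and the expectation is approximated by a midpoint-rule sum. For this case the exact value is $1/(12M^2)$, which shrinks as M grows but never reaches zero.

```python
def uniform_quantizer_mse(m, grid=100_000):
    """Approximate E[(U - V)^2] for U uniform on [0, 1) and an m-level
    quantizer with equal-width cells and cell midpoints as outputs."""
    total = 0.0
    for i in range(grid):
        u = (i + 0.5) / grid            # midpoint-rule sample of the density
        cell = min(int(u * m), m - 1)   # index of the cell containing u
        v = (cell + 0.5) / m            # representation point of that cell
        total += (u - v) ** 2
    return total / grid

for m in (2, 4, 8):
    # The two printed columns should agree closely: estimate vs. 1/(12 m^2).
    print(m, uniform_quantizer_mse(m), 1.0 / (12 * m * m))
```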


There are some disadvantages to measuring distortion only in a mean-squared sense. For example,
efficient speech coders are based on models of human speech. They make use of the fact
that human listeners are more sensitive to some kinds of reconstruction error than others, so as,
for example, to permit larger errors when the signal is loud than when it is soft. Speech coding
is a specialized topic which we do not have time to explore (see, for example, [10]). However,
understanding compression relative to a mean-squared distortion measure will develop many of 
the underlying principles needed in such more specialized studies. 


In what follows, scalar quantization is considered first. Here each analog rv in the sequence is 
quantized independently of the other rv’s. Next vector quantization is considered. Here the 
analog sequence is first segmented into blocks of n rv’s each; then each n-tuple is quantized as 
a unit. 


Our initial approach to both scalar and vector quantization will be to minimize mean-squared 
distortion subject to a constraint on the size of the quantization alphabet. Later, we consider 
minimizing mean-squared distortion subject to a constraint on the entropy of the quantized 
output. This is the relevant approach to quantization if the quantized output sequence is to be 
source-encoded in an efficient manner, i.e., to reduce the number of encoded bits per quantized
symbol to little more than the corresponding entropy. 
3.2 Scalar quantization


A scalar quantizer partitions the set $\mathbb{R}$ of real numbers into M subsets $\mathcal{R}_1, \ldots, \mathcal{R}_M$, called
quantization regions. Assume that each quantization region is an interval; it will soon be seen
why this assumption makes sense. Each region $\mathcal{R}_j$ is then represented by a representation point
$a_j \in \mathbb{R}$. When the source produces a number $u \in \mathcal{R}_j$, that number is quantized into the point
$a_j$. A scalar quantizer can be viewed as a function $\{v(u) : \mathbb{R} \to \mathbb{R}\}$ that maps analog real values
u into discrete real values v(u) where $v(u) = a_j$ for $u \in \mathcal{R}_j$.


An analog sequence $u_1, u_2, \ldots$ of real-valued symbols is mapped by such a quantizer into the
discrete sequence $v(u_1), v(u_2), \ldots$. Taking $u_1, u_2, \ldots$ as sample values of a random sequence
$U_1, U_2, \ldots$, the map v(u) generates an rv $V_k$ for each $U_k$; $V_k$ takes the value $a_j$ if $U_k \in \mathcal{R}_j$. Thus
each quantized output $V_k$ is a discrete rv with the alphabet $\{a_1, \ldots, a_M\}$. The discrete random
sequence $V_1, V_2, \ldots$ is encoded into binary digits, transmitted, and then decoded back into the
same discrete sequence. For now, assume that transmission is error-free.

We first investigate how to choose the quantization regions $\mathcal{R}_1, \ldots, \mathcal{R}_M$, and how to choose
the corresponding representation points. Initially assume that the regions are intervals, ordered
as in Figure 3.2, with $\mathcal{R}_1 = (-\infty, b_1]$, $\mathcal{R}_2 = (b_1, b_2]$, ..., $\mathcal{R}_M = (b_{M-1}, \infty)$. Thus an M-level
quantizer is specified by M − 1 interval endpoints, $b_1, \ldots, b_{M-1}$, and M representation points,
$a_1, \ldots, a_M$.


[Figure 3.2 diagram: the real line divided by interval endpoints into six adjacent regions, with one representation point $a_1, \ldots, a_6$ in each region.]

Figure 3.2: Quantization regions and representation points.


For a given value of M, how can the regions and representation points be chosen to minimize 
mean-squared error? This question is explored in two ways: 


• Given a set of representation points $\{a_j\}$, how should the intervals $\{\mathcal{R}_j\}$ be chosen?


• Given a set of intervals $\{\mathcal{R}_j\}$, how should the representation points $\{a_j\}$ be chosen?


3.2.1 Choice of intervals for given representation points 


The choice of intervals for given representation points, $\{a_j; 1 \le j \le M\}$, is easy: given any $u \in \mathbb{R}$,
the squared error to $a_j$ is $(u - a_j)^2$. This is minimized (over the fixed set of representation
points $\{a_j\}$) by representing u by the closest representation point $a_j$. This means, for example,
that if u is between $a_j$ and $a_{j+1}$, then u is mapped into the closer of the two. Thus the
boundary $b_j$ between $\mathcal{R}_j$ and $\mathcal{R}_{j+1}$ must lie halfway between the representation points $a_j$ and
$a_{j+1}$, $1 \le j \le M-1$. That is, $b_j = \frac{a_j + a_{j+1}}{2}$. This specifies each quantization region, and also
shows why each region should be an interval. Note that this minimization of mean-squared
distortion does not depend on the probabilistic model for $U_1, U_2, \ldots$.
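
The midpoint rule translates directly into a small quantizer function. The sketch below assumes the representation points are given in increasing order; the names `midpoint_boundaries` and `quantize` are ours. A value lying exactly on a boundary $b_j$ is assigned to the lower region, matching the convention $\mathcal{R}_j = (b_{j-1}, b_j]$ used above.

```python
from bisect import bisect_left

def midpoint_boundaries(points):
    """Boundaries b_j = (a_j + a_{j+1}) / 2 for sorted representation points."""
    return [(lo + hi) / 2 for lo, hi in zip(points, points[1:])]

def quantize(u, points, boundaries):
    """Map u to the point a_j whose region (b_{j-1}, b_j] contains u."""
    # bisect_left counts the boundaries strictly below u, which is exactly
    # the zero-based index of the region containing u.
    return points[bisect_left(boundaries, u)]

points = [-1.0, 0.0, 2.0]
boundaries = midpoint_boundaries(points)       # [-0.5, 1.0]
print([quantize(u, points, boundaries) for u in (-3.0, 0.4, 1.0, 1.5)])
# [-1.0, 0.0, 0.0, 2.0]
```

No probability model appears anywhere in this code, which reflects the remark that the choice of regions for given points does not depend on the distribution of the source.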
3.2.2 Choice of representation points for given intervals


For the second question, the probabilistic model for $U_1, U_2, \ldots$ is important. For example, if
it is known that each $U_k$ is discrete and has only one sample value in each interval, then the
representation points would be chosen as those sample values. Suppose now that the rv’s $\{U_k\}$
are iid analog rv’s with the pdf $f_U(u)$. For a given set of points $\{a_j\}$, V(U) maps each sample
value $u \in \mathcal{R}_j$ into $a_j$. The mean-squared distortion (or mean-squared error MSE) is then


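$$\text{MSE} = E[(U - V(U))^2] = \int_{-\infty}^{\infty} f_U(u)\,(u - v(u))^2 \, du = \sum_{j=1}^{M} \int_{\mathcal{R}_j} f_U(u)\,(u - a_j)^2 \, du. \qquad (3.1)$$


In order to minimize (3.1) over the set of $a_j$, it is simply necessary to choose each $a_j$ to minimize
the corresponding integral (remember that the regions are considered fixed here). Let $f_j(u)$
denote the conditional pdf of U given that $\{u \in \mathcal{R}_j\}$; i.e.,

$$f_j(u) = \begin{cases} \dfrac{f_U(u)}{Q_j}, & \text{if } u \in \mathcal{R}_j; \\ 0, & \text{otherwise,} \end{cases} \qquad (3.2)$$

where $Q_j = \Pr\{U \in \mathcal{R}_j\}$. Then, for the interval $\mathcal{R}_j$,

$$\int_{\mathcal{R}_j} f_U(u)\,(u - a_j)^2 \, du = Q_j \int_{\mathcal{R}_j} f_j(u)\,(u - a_j)^2 \, du. \qquad (3.3)$$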
Rj Rj 


Now (3.3) is minimized by choosing $a_j$ to be the mean of a random variable with the pdf $f_j(u)$.
To see this, note that for any rv Y and real number a,

$$E[(Y - a)^2] = E[Y^2] - 2a\,E[Y] + a^2,$$

which is minimized over a when $a = E[Y]$.


This provides a set of conditions that the endpoints $\{b_j\}$ and the points $\{a_j\}$ must satisfy to
achieve the minimum MSE: namely, each $b_j$ must be the midpoint between $a_j$ and $a_{j+1}$ and each $a_j$
must be the mean of an rv $U_j$ with pdf $f_j(u)$. In other words, $a_j$ must be the conditional mean
of U conditional on $U \in \mathcal{R}_j$.
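
The conditional-mean condition can be checked numerically for a single region. The sketch below is an illustration under assumptions of our own: the density $f_U(u) = 2u$ on [0, 1], the region (0.5, 1], and midpoint-rule integration. For this density the conditional mean works out to 7/9, and the region's contribution to the MSE is larger at any other choice of representation point.

```python
def conditional_mean(pdf, lo, hi, grid=10_000):
    """Midpoint-rule estimate of E[U | lo < U <= hi] for the density pdf."""
    h = (hi - lo) / grid
    mass = 0.0      # approximates Q_j = Pr{U in R_j}
    moment = 0.0    # approximates the integral of u * f_U(u) over R_j
    for i in range(grid):
        u = lo + (i + 0.5) * h
        w = pdf(u) * h
        mass += w
        moment += u * w
    return moment / mass

def region_distortion(pdf, lo, hi, a, grid=10_000):
    """Midpoint-rule estimate of the integral of f_U(u) (u - a)^2 over (lo, hi]."""
    h = (hi - lo) / grid
    total = 0.0
    for i in range(grid):
        u = lo + (i + 0.5) * h
        total += pdf(u) * (u - a) ** 2 * h
    return total

pdf = lambda u: 2.0 * u                      # illustrative density on [0, 1]
a_star = conditional_mean(pdf, 0.5, 1.0)     # close to 7/9
for a in (a_star - 0.05, a_star, a_star + 0.05):
    # The middle value (at a_star) should be the smallest of the three.
    print(a, region_distortion(pdf, 0.5, 1.0, a))
```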


These conditions are necessary to minimize the MSE for a given number M of representation 
points. They are not sufficient, as shown by an example at the end of this section. Nonetheless, 
these necessary conditions provide some insight into the minimization of the MSE. 


3.2.3 The Lloyd-Max algorithm 


The Lloyd-Max algorithm¹ is an algorithm for finding the endpoints $\{b_j\}$ and the representation
points $\{a_j\}$ to meet the above necessary conditions. The algorithm is almost obvious given the
necessary conditions; the contribution of Lloyd and Max was to define the problem and develop
the necessary conditions. The algorithm simply alternates between the optimizations of the
previous subsections, namely optimizing the endpoints $\{b_j\}$ for a given set of $\{a_j\}$, and then
optimizing the points $\{a_j\}$ for the new endpoints.


¹This algorithm was developed independently by S. P. Lloyd in 1957 and J. Max in 1960. Lloyd’s work was
done in the Bell Laboratories research department and became widely circulated, although unpublished until 1982
[16]. Max’s work [18] was published in 1960.
The Lloyd-Max algorithm is as follows. Assume that the number M of quantizer levels and the
pdf $f_U(u)$ are given.


1. Choose an arbitrary initial set of M representation points $a_1 < a_2 < \cdots < a_M$.

2. For each j; 1 ≤ j ≤ M−1, set $b_j = \frac{1}{2}(a_{j+1} + a_j)$.


3. For each j; 1 ≤ j ≤ M, set $a_j$ equal to the conditional mean of U given $U \in (b_{j-1}, b_j]$ (where
$b_0$ and $b_M$ are taken to be −∞ and +∞ respectively).


4. Repeat steps (2) and (3) until further improvement in MSE is negligible; then stop.


The MSE decreases (or remains the same) for each execution of step (2) and step (3). Since the
MSE is nonnegative, it approaches some limit. Thus if the algorithm terminates when the MSE
improvement is less than some given ε > 0, then the algorithm must terminate after a finite
number of iterations.
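
The four steps can be sketched as follows. This is a minimal illustration rather than a reference implementation, and it rests on assumptions that are ours: the pdf is truncated to a finite interval [lo, hi] and represented by midpoint samples on a uniform grid, so conditional means are computed as weighted sums, and the stopping threshold `tol` plays the role of ε.

```python
import math

def lloyd_max(pdf, lo, hi, points, tol=1e-12, grid=20_000, max_iter=1_000):
    """Alternate steps (2) and (3) for a density restricted to [lo, hi].

    Returns (points, boundaries, mse). The answer is only as accurate as the
    grid discretization of the density.
    """
    h = (hi - lo) / grid
    us = [lo + (i + 0.5) * h for i in range(grid)]
    ws = [pdf(u) * h for u in us]
    total = sum(ws)
    ws = [w / total for w in ws]              # renormalize after truncation
    points = sorted(points)                   # step 1: initial points, ordered
    prev_mse = float("inf")
    for _ in range(max_iter):
        # Step 2: each boundary is the midpoint of two adjacent points.
        bounds = [(a + b) / 2 for a, b in zip(points, points[1:])]
        # Step 3: each point becomes the conditional mean over its region.
        mass = [0.0] * len(points)
        moment = [0.0] * len(points)
        j = 0
        for u, w in zip(us, ws):
            while j < len(bounds) and u > bounds[j]:
                j += 1                        # us is increasing, so j only grows
            mass[j] += w
            moment[j] += u * w
        points = [m1 / m0 if m0 > 0 else a    # keep a point whose region is empty
                  for a, m0, m1 in zip(points, mass, moment)]
        # MSE of the updated points over the regions used in step 3.
        mse = 0.0
        j = 0
        for u, w in zip(us, ws):
            while j < len(bounds) and u > bounds[j]:
                j += 1
            mse += w * (u - points[j]) ** 2
        # Step 4: stop once the improvement is negligible.
        if prev_mse - mse < tol:
            break
        prev_mse = mse
    final_bounds = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return points, final_bounds, mse

gauss = lambda u: math.exp(-u * u / 2) / math.sqrt(2 * math.pi)
points, bounds, mse = lloyd_max(gauss, -6.0, 6.0, [-1.0, 0.5])
print(points, bounds, mse)
```

For a unit-variance Gaussian with M = 2, the points should come out near $\pm\sqrt{2/\pi} \approx \pm 0.798$ and the MSE near $1 - 2/\pi \approx 0.363$, up to discretization error.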


Example 3.2.1. This example shows that the algorithm might reach a local minimum of MSE
instead of the global minimum. Consider a quantizer with M = 2 representation points, and an
rv U whose pdf $f_U(u)$ has three peaks, as shown in Figure 3.3.


[Figure 3.3 plot: a pdf $f_U(u)$ with three peaks; region $\mathcal{R}_1$ with point $a_1$ covers the leftmost peak, and region $\mathcal{R}_2$ with point $a_2$ covers the two rightmost peaks.]


Figure 3.3: Example of regions and representation points that satisfy Lloyd-Max conditions
without minimizing mean-squared distortion.


It can be seen that one region must cover two of the peaks, yielding quite a bit of distortion,
while the other will represent the remaining peak, yielding little distortion. In the figure, the
two rightmost peaks are both covered by $\mathcal{R}_2$, with the point $a_2$ between them. Both the points
and the regions satisfy the necessary conditions and cannot be locally improved. However, it
can be seen in the figure that the rightmost peak is more probable than the other peaks. It
follows that the MSE would be lower if $\mathcal{R}_1$ covered the two leftmost peaks.


The Lloyd-Max algorithm is a type of hill-climbing algorithm; starting with an arbitrary set of
values, these values are modified until reaching the top of a hill where no more local improvements
are possible.² A reasonable approach in this sort of situation is to try many randomly chosen
starting points, perform the Lloyd-Max algorithm on each and then take the best solution. This
is somewhat unsatisfying since there is no general technique for determining when the optimal
solution has been found.
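
The behavior in Example 3.2.1 and the multiple-start remedy can be reproduced on a toy source. As an illustrative assumption of ours, the three peaks of the pdf are replaced by point masses at −1, 0 and 1 with probabilities 0.3, 0.3 and 0.4, and a handful of fixed starting pairs stand in for randomly chosen ones. The function name `lloyd_discrete` is also ours.

```python
def lloyd_discrete(values, probs, points, max_iter=100):
    """Lloyd-Max iteration for a discrete rv; returns (points, mse)."""
    points = sorted(points)
    for _ in range(max_iter):
        # Nearest-point assignment; ties go to the lower point, matching
        # the region convention R_j = (b_{j-1}, b_j].
        assign = [min(range(len(points)), key=lambda j: (abs(u - points[j]), j))
                  for u in values]
        new_points = []
        for j, a in enumerate(points):
            mass = sum(p for p, k in zip(probs, assign) if k == j)
            mean = sum(u * p for u, p, k in zip(values, probs, assign) if k == j)
            new_points.append(mean / mass if mass > 0 else a)
        if new_points == points:      # neither step changes anything: stop
            break
        points = new_points
    mse = sum(p * min((u - a) ** 2 for a in points) for u, p in zip(values, probs))
    return points, mse

values, probs = [-1.0, 0.0, 1.0], [0.3, 0.3, 0.4]
stuck = lloyd_discrete(values, probs, [-1.0, 0.6])    # R_2 covers 0 and 1
better = lloyd_discrete(values, probs, [-0.6, 1.0])   # R_1 covers -1 and 0
starts = [[-1.0, 0.6], [-0.6, 1.0], [-0.2, 0.2], [0.5, 1.5]]
best = min((lloyd_discrete(values, probs, s) for s in starts), key=lambda r: r[1])
print(stuck, better, best)
```

Both `stuck` and `better` satisfy the necessary conditions, but their MSE values differ (about 0.171 versus 0.15), so the outcome depends on the starting points. Taking the best over several starts recovers the lower value here, although in general nothing certifies that the best solution found is globally optimal.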


²It would be better to call this a valley-descending algorithm, both because a minimum is desired and also
because binoculars cannot be used at the bottom of a valley to find a distant lower valley.
3.3 Vector quantization


As with source coding of discrete sources, we next consider quantizing n source variables at a 
time. This is called vector quantization, since an n-tuple of rv’s may be regarded as a vector 
rv in an n-dimensional vector space. We will concentrate on the case n = 2 so that illustrative 
pictures can be drawn. 


One possible approach is to quantize each dimension independently with a scalar (one-dimensional)
quantizer. This results in a rectangular grid of quantization regions as shown
below. The MSE per dimension is the same as for the scalar quantizer using the same number 
of bits per dimension. Thus the best 2D vector quantizer has an MSE per dimension at least as 
small as that of the best scalar quantizer. 


Figure 3.4: 2D rectangular quantizer. 


To search for the minimum-MSE 2D vector quantizer with a given number M of representation 
points, the same approach is used as with scalar quantization. 


Let $(U, U')$ be the two rv’s being jointly quantized. Suppose a set of M 2D representation points
$\{(a_j, a'_j);\ 1 \le j \le M\}$ is chosen. For example, in the figure above, there are 16 representation
points, represented by small dots. Given a sample pair $(u, u')$ and given the M representation
points, which representation point should be chosen for the given $(u, u')$? Again, the answer is
easy. Since mapping $(u, u')$ into $(a_j, a'_j)$ generates a squared error equal to $(u - a_j)^2 + (u' - a'_j)^2$,
the point $(a_j, a'_j)$ which is closest to $(u, u')$ in Euclidean distance should be chosen.


Consequently, the region $\mathcal{R}_j$ must be the set of points $(u, u')$ that are closer to $(a_j, a'_j)$ than
to any other representation point. Thus the regions $\{\mathcal{R}_j\}$ are minimum-distance regions; these
regions are called the Voronoi regions for the given representation points. The boundaries of
the Voronoi regions are perpendicular bisectors between neighboring representation points. The
minimum-distance regions are thus in general convex polygonal regions, as illustrated in the
figure below.


As in the scalar case, the MSE can be minimized for a given set of regions by choosing the 
representation points to be the conditional means within those regions. Then, given this new 
set of representation points, the MSE can be further reduced by using the Voronoi regions for 
the new points. This gives us a 2D version of the Lloyd-Max algorithm, which must converge 
to a local minimum of the MSE. This can be generalized straightforwardly to any dimension n. 
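

A minimal sample-based sketch of this 2D iteration is given below. It assumes an empirical distribution (a list of sample pairs) in place of a joint pdf, and the helper names (`nearest`, `lloyd_step`, `mse_per_dim`), the Gaussian test source, and the choice of 8 points are hypothetical.

```python
import random

def nearest(point, reps):
    """Index of the representation point closest in Euclidean distance."""
    u, v = point
    return min(range(len(reps)),
               key=lambda j: (u - reps[j][0]) ** 2 + (v - reps[j][1]) ** 2)

def mse_per_dim(samples, reps):
    """Empirical MSE per dimension using minimum-distance (Voronoi) regions."""
    total = 0.0
    for p in samples:
        a = reps[nearest(p, reps)]
        total += (p[0] - a[0]) ** 2 + (p[1] - a[1]) ** 2
    return total / (2 * len(samples))

def lloyd_step(samples, reps):
    """One iteration: Voronoi assignment, then conditional (sample) means."""
    sums = [[0.0, 0.0, 0] for _ in reps]
    for p in samples:
        s = sums[nearest(p, reps)]
        s[0] += p[0]
        s[1] += p[1]
        s[2] += 1
    # An empty region keeps its old representation point.
    return [(s[0] / s[2], s[1] / s[2]) if s[2] else a
            for s, a in zip(sums, reps)]

rng = random.Random(1)
samples = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
reps = rng.sample(samples, 8)
history = [mse_per_dim(samples, reps)]
for _ in range(15):
    reps = lloyd_step(samples, reps)
    history.append(mse_per_dim(samples, reps))
```

The list `history` is nonincreasing, reflecting the fact that neither step can increase the MSE. As in the scalar case, the limit is only a local minimum.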


As already seen, the Lloyd-Max algorithm only finds local minima to the MSE for scalar quantizers.
For vector quantizers, the problem of local minima becomes even worse. For example,
when $U_1, U_2, \ldots$ are iid, it is easy to see that the rectangular quantizer in Figure 3.4 satisfies
the Lloyd-Max conditions if the corresponding scalar quantizer does (see Exercise 3.10). It will
soon be seen, however, that this is not necessarily the minimum MSE.


Figure 3.5: Voronoi regions for given set of representation points.


Vector quantization was a popular research topic for many years. The problem is that quantizing 
complexity goes up exponentially with n, and the reduction in MSE with increasing n is quite 
modest, unless the samples are statistically highly dependent. 


3.4 Entropy-coded quantization 


We must now ask if minimizing the MSE for a given number M of representation points is the 
right problem. The minimum expected number of bits per symbol, $L_{\min}$, required to encode the
quantizer output was shown in Chapter 2 to be governed by the entropy H[V] of the quantizer 
output, not by the size M of the quantization alphabet. Therefore, anticipating efficient source 
coding of the quantized outputs, we should really try to minimize the MSE for a given entropy 
H[V] rather than a given number of representation points. 


This approach is called entropy-coded quantization and is almost implicit in the layered approach 
to source coding represented in Figure 3.1. Discrete source coding close to the entropy bound 
is similarly often called entropy coding. Thus entropy-coded quantization refers to quantization 
techniques that are designed to be followed by entropy coding. 


The entropy H[V] of the quantizer output is determined only by the probabilities of the quantiza- 
tion regions. Therefore, given a set of regions, choosing the representation points as conditional 
means minimizes their distortion without changing the entropy. However, given a set of rep- 
resentation points, the optimal regions are not necessarily Voronoi regions (e.g., in a scalar 
quantizer, the point separating two adjacent regions is not necessarily equidistant from the two 
representation points).


For example, for a scalar quantizer with a constraint $H[V] \le \frac{1}{2}$ and a Gaussian pdf for U, a
reasonable choice is three regions, the center one having high probability $1 - 2p$ and the outer
ones having small, equal probability p, such that $H[V] = \frac{1}{2}$.
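

As a rough numerical illustration of this three-region choice, the sketch below solves for the outer-region probability p that gives an output entropy of 1/2 bit, and then finds the corresponding thresholds for a Gaussian source. The unit variance, the bisection routine, and the names `entropy3` and `solve_p` are assumptions made for the example.

```python
import math
from statistics import NormalDist

def entropy3(p):
    """Entropy in bits of the pmf (p, 1 - 2p, p)."""
    q = 1 - 2 * p
    return -(2 * p * math.log2(p) + q * math.log2(q))

def solve_p(target, lo=1e-9, hi=1 / 3, steps=200):
    """Bisection for entropy3(p) = target; entropy3 is increasing on (0, 1/3]."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if entropy3(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p = solve_p(0.5)
# Threshold t with Pr(U > t) = p for a zero-mean, unit-variance Gaussian;
# the three regions are then (-inf, -t), [-t, t] and (t, inf).
t = NormalDist().inv_cdf(1 - p)
```

The outer regions turn out to have small probability, so the center region carries almost all of the probability, as the text describes.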


Even for scalar quantizers, minimizing MSE subject to an entropy constraint is a rather messy 
problem. Considerable insight into the problem can be obtained by looking at the case where 
the target entropy is large, i.e., when a large number of points can be used to achieve small
MSE. Fortunately this is the case of greatest practical interest. 


Example 3.4.1. For the following simple example, consider the minimum-MSE quantizer using
a constraint on the number of representation points M compared to that using a constraint on
the entropy H[V].


Figure 3.6: Comparison of constraint on M to constraint on H[U].


The example shows a piecewise constant pdf $f_U(u)$ that takes on only two positive values, say
$f_U(u) = f_1$ over an interval of size $L_1$, and $f_U(u) = f_2$ over a second interval of size $L_2$. Assume
that $f_U(u) = 0$ elsewhere. Because of the wide separation between the two intervals, they can
be quantized separately without providing any representation point in the region between the
intervals. Let $M_1$ and $M_2$ be the number of representation points in each interval. In the figure,
$M_1 = 9$ and $M_2 = 7$. Let $\Delta_1 = L_1/M_1$ and $\Delta_2 = L_2/M_2$ be the lengths of the quantization
regions in the two ranges (by symmetry, each quantization region in a given interval should have
the same length). The representation points are at the center of each quantization interval.
The MSE, conditional on being in a quantization region of length $\Delta_i$, is the MSE of a uniform
distribution over an interval of length $\Delta_i$, which is easily computed to be $\Delta_i^2/12$. The probability
of being in a given quantization region of size $\Delta_i$ is $f_i \Delta_i$, so the overall MSE is given by


$$ \mathrm{MSE} = M_1 \frac{\Delta_1^2}{12} f_1 \Delta_1 + M_2 \frac{\Delta_2^2}{12} f_2 \Delta_2 = \frac{1}{12} \Delta_1^2 f_1 L_1 + \frac{1}{12} \Delta_2^2 f_2 L_2. \tag{3.4} $$


This can be minimized over $\Delta_1$ and $\Delta_2$ subject to the constraint that
$M = M_1 + M_2 = L_1/\Delta_1 + L_2/\Delta_2$. Ignoring the constraint that $M_1$ and $M_2$ are integers (which makes sense
for M large), Exercise 3.4 shows that the minimum MSE occurs when $\Delta_i$ is chosen inversely
proportional to the cube root of $f_i$. In other words,


$$ \frac{\Delta_1}{\Delta_2} = \left( \frac{f_2}{f_1} \right)^{1/3}. \tag{3.5} $$


This says that the size of a quantization region decreases with increasing probability density. 
This is reasonable, putting the greatest effort where there is the most probability. What is 
perhaps surprising is that this effect is so small, proportional only to a cube root. 
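

The cube-root rule can be checked numerically with a brute-force scan, as sketched below. The particular values of $f_1, L_1, f_2, L_2$ and M, the grid resolution, and the function names are hypothetical; the scan ignores the integer constraint on $M_1$ and $M_2$, as in the text.

```python
def mse(d1, d2, f1, f2, L1, L2):
    """MSE for region lengths d1 and d2, following equation (3.4)."""
    return (L1 * f1 * d1 ** 2 + L2 * f2 * d2 ** 2) / 12

def best_split(M, f1, f2, L1, L2, grid=20000):
    """Scan real-valued M1 in (0, M) and keep the split with the lowest MSE."""
    best = None
    for k in range(1, grid):
        m1 = M * k / grid
        m2 = M - m1
        d1, d2 = L1 / m1, L2 / m2
        cand = (mse(d1, d2, f1, f2, L1, L2), d1, d2)
        if best is None or cand < best:
            best = cand
    return best

# Hypothetical two-level pdf; note that f1*L1 + f2*L2 = 1.
f1, L1, f2, L2 = 0.8, 1.0, 0.05, 4.0
_, d1, d2 = best_split(64, f1, f2, L1, L2)
ratio = d1 / d2                      # numerically optimal ratio of lengths
predicted = (f2 / f1) ** (1 / 3)     # cube-root rule of (3.5)
```

Even though the density differs by a factor of 16 between the two intervals in this example, the optimal region lengths differ only by a factor of about 2.5.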


Perhaps even more surprisingly, if the MSE is minimized subject to a constraint on entropy for 
this example, then Exercise 3.4 shows that, in the limit of high rate, the quantization intervals 
all have the same length! A scalar quantizer in which all intervals have the same length is called 
a uniform scalar quantizer. The following sections will show that uniform scalar quantizers have 
remarkable properties for high-rate quantization. 


3.5 High-rate entropy-coded quantization 


This section focuses on high-rate quantizers where the quantization regions can be made sufficiently
small so that the probability density is approximately constant within each region. It will
be shown that under these conditions the combination of a uniform scalar quantizer followed by
discrete entropy coding is nearly optimum (in terms of mean-squared distortion) within the class
of scalar quantizers. This means that a uniform quantizer can be used as a universal quantizer
with very little loss of optimality. The probability distribution of the rv’s to be quantized can
be exploited at the level of discrete source coding. Note however that this essential optimality
of uniform quantizers relies heavily on the assumption that mean-squared distortion is an appropriate
distortion measure. With voice coding, for example, a given distortion at low signal
levels is far more harmful than the same distortion at high signal levels.


In the following sections, it is assumed that the source output is a sequence $U_1, U_2, \ldots$ of iid
real analog-valued rv’s, each with a probability density $f_U(u)$. It is further assumed that the
probability density function (pdf) $f_U(u)$ is smooth enough and the quantization fine enough
that $f_U(u)$ is almost constant over each quantization region.


The analogue of the entropy H[X] of a discrete rv is the differential entropy h[U] of an analog
rv. After defining h[U], the properties of H[U] and h[U] will be compared.


The performance of a uniform scalar quantizer followed by entropy coding will then be analyzed. 
It will be seen that there is a tradeoff between the rate of the quantizer and the mean-squared 
error (MSE) between source and quantized output. It is also shown that the uniform quantizer 
is essentially optimum among scalar quantizers at high rate. 


The performance of uniform vector quantizers followed by entropy coding will then be analyzed
and similar tradeoffs will be found. A major result is that vector quantizers can achieve a gain
over scalar quantizers (i.e., a reduction of MSE for given quantizer rate), but that the reduction
in MSE is at most a factor of $\pi e/6 \approx 1.42$.


The changes in MSE for different quantization methods, and similarly, changes in power levels on
channels, are invariably calculated by communication engineers in decibels (dB). The number of
decibels corresponding to a reduction of $\alpha$ in the mean squared error is defined to be $10 \log_{10} \alpha$.
The use of a logarithmic measure allows the various components of mean squared error or power
gain to be added rather than multiplied.


The use of decibels rather than some other logarithmic measure such as natural logs or logs to
the base 2 is partly motivated by the ease of doing rough mental calculations. A factor of 2 is
$10 \log_{10} 2 = 3.010\ldots$ dB, approximated as 3 dB. Thus $4 = 2^2$ is 6 dB and 8 is 9 dB. Since 10
is 10 dB, we also see that 5 is 10/2 or 7 dB. We can just as easily see that 20 is 13 dB and so
forth. The limiting factor of 1.42 in MSE above is then a reduction of 1.53 dB.
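

The rough mental values above can be compared with exact values in a few lines. The helper name `db` and the dictionary of rough values are introduced here only for illustration.

```python
import math

def db(ratio):
    """Decibels corresponding to a power or MSE ratio."""
    return 10 * math.log10(ratio)

# Rough mental values from the text, keyed by the ratio they approximate.
rough = {2: 3, 4: 6, 8: 9, 10: 10, 5: 7, 20: 13}
errors = {x: abs(db(x) - approx) for x, approx in rough.items()}

# The limiting vector-quantization gain pi*e/6, expressed in dB.
limit_db = db(math.pi * math.e / 6)
```

The entries of `errors` are all a few hundredths of a dB, which is why the 3 dB rule is adequate for rough calculations.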


As in the discrete case, generalizations to analog sources with memory are possible, but not 
discussed here. 


3.6 Differential entropy 


The differential entropy h[U] of an analog random variable (rv) U is analogous to the entropy
H[X] of a discrete random symbol X. It has many similarities, but also some important differences.


Definition The differential entropy of an analog real rv U with pdf $f_U(u)$ is

$$ h[U] = \int_{-\infty}^{\infty} -f_U(u) \log f_U(u) \, du. $$

The integral may be restricted to the region where $f_U(u) > 0$, since $0 \log 0$ is interpreted as 0.
Assume that $f_U(u)$ is smooth and that the integral exists with a finite value. Exercise 3.7 gives
an example where h[U] is infinite.


As before, the logarithms are base 2 and the units of h[U] are bits per source symbol.


Like H[X], the differential entropy h[U] is the expected value of the rv $-\log f_U(U)$. The log of
the joint density of several independent rv’s is the sum of the logs of the individual pdf’s, and
this can be used to derive an AEP similar to the discrete case.


Unlike H[X], the differential entropy h[U] can be negative and depends on the scaling of the
outcomes. This can be seen from the following two examples.


Example 3.6.1 (Uniform distributions). Let $f_U(u)$ be a uniform distribution over an interval
$[a, a + \Delta]$ of length $\Delta$; i.e., $f_U(u) = 1/\Delta$ for $u \in [a, a + \Delta]$, and $f_U(u) = 0$ elsewhere. Then
$-\log f_U(u) = \log \Delta$ where $f_U(u) > 0$ and

$$ h[U] = E[-\log f_U(U)] = \log \Delta. $$


Example 3.6.2 (Gaussian distribution). Let $f_U(u)$ be a Gaussian distribution with mean
m and variance $\sigma^2$; i.e.,

$$ f_U(u) = \sqrt{\frac{1}{2\pi\sigma^2}} \exp\left\{ -\frac{(u - m)^2}{2\sigma^2} \right\}. $$

Then $-\log f_U(u) = \frac{1}{2} \log 2\pi\sigma^2 + (\log e)(u - m)^2/(2\sigma^2)$. Since $E[(U - m)^2] = \sigma^2$,

$$ h[U] = E[-\log f_U(U)] = \frac{1}{2} \log(2\pi\sigma^2) + \frac{1}{2} \log e = \frac{1}{2} \log(2\pi e \sigma^2). $$


It can be seen from these expressions that by making $\Delta$ or $\sigma^2$ arbitrarily small, the differential
entropy can be made arbitrarily negative, while by making $\Delta$ or $\sigma^2$ arbitrarily large, the
differential entropy can be made arbitrarily positive.


If the rv U is rescaled to $\alpha U$ for some scale factor $\alpha > 0$, then the differential entropy is increased
by $\log \alpha$, both in these examples and in general. In other words, h[U] is not invariant to scaling.
Note, however, that differential entropy is invariant to translation of the pdf, i.e., an rv and its
fluctuation around the mean have the same differential entropy.
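

These properties can be checked numerically for the Gaussian case. The sketch below estimates h[U] by a midpoint-rule integral; the integration limits, the number of steps, and the names `diff_entropy` and `gaussian` are assumptions of the example rather than anything prescribed by the text.

```python
import math

def diff_entropy(pdf, lo, hi, n=20000):
    """Midpoint-rule estimate of h[U] = -integral of f log2 f over [lo, hi]."""
    du = (hi - lo) / n
    total = 0.0
    for k in range(n):
        f = pdf(lo + (k + 0.5) * du)
        if f > 0:                      # 0 log 0 is interpreted as 0
            total -= f * math.log2(f) * du
    return total

def gaussian(sigma, m=0.0):
    """Gaussian pdf with mean m and variance sigma**2."""
    c = 1 / math.sqrt(2 * math.pi * sigma ** 2)
    return lambda u: c * math.exp(-(u - m) ** 2 / (2 * sigma ** 2))

h1 = diff_entropy(gaussian(1.0), -12, 12)
h_small = diff_entropy(gaussian(0.1), -1.2, 1.2)        # U rescaled by alpha = 0.1
h_shift = diff_entropy(gaussian(1.0, m=3.0), -9, 15)    # U translated by 3
closed_form = 0.5 * math.log2(2 * math.pi * math.e)     # (1/2) log(2 pi e sigma^2), sigma = 1
```

Here `h_small` is negative and differs from `h1` by $\log_2 0.1$, while `h_shift` agrees with `h1`, illustrating the dependence on scale and the invariance to translation.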


One of the important properties of entropy is that it does not depend on the labeling of the 
elements of the alphabet, i.e., it is invariant to invertible transformations. Differential entropy 
is very different in this respect, and, as just illustrated, it is modified by even such a trivial 
transformation as a change of scale. The reason for this is that the probability density is a 
probability per unit length, and therefore depends on the measure of length. In fact, as seen 
more clearly later, this fits in very well with the fact that source coding for analog sources also 
depends on an error term per unit length. 


Definition The differential entropy of an n-tuple of rv’s $\mathbf{U}^n = (U_1, \ldots, U_n)$ with joint pdf
$f_{\mathbf{U}^n}(\mathbf{u}^n)$ is

$$ h[\mathbf{U}^n] = E[-\log f_{\mathbf{U}^n}(\mathbf{U}^n)]. $$

Like entropy, differential entropy has the property that if U and V are independent rv’s, then
the entropy of the joint variable UV with pdf $f_{UV}(u, v) = f_U(u) f_V(v)$ is $h[UV] = h[U] + h[V]$.
Again, this follows from the fact that the log of the joint probability density of independent rv’s
is additive, i.e., $-\log f_{UV}(u, v) = -\log f_U(u) - \log f_V(v)$.

Thus the differential entropy of a vector rv $\mathbf{U}^n$, corresponding to a string of n iid rv’s
$U_1, U_2, \ldots, U_n$, each with the density $f_U(u)$, is $h[\mathbf{U}^n] = n h[U]$.


3.7 Performance of uniform high-rate scalar quantizers 


This section analyzes the performance of uniform scalar quantizers in the limit of high rate. 
Appendix A continues the analysis for the nonuniform case and shows that uniform quantizers 
are effectively optimal in the high-rate limit. 

For a uniform scalar quantizer, every quantization interval $\mathcal{R}_j$ has the same length $|\mathcal{R}_j| = \Delta$.
In other words, $\mathbb{R}$ (or the portion of $\mathbb{R}$ over which $f_U(u) > 0$), is partitioned into equal intervals,
each of length $\Delta$.


Figure 3.7: Uniform scalar quantizer.


Assume there are enough quantization regions to cover the region where $f_U(u) > 0$. For the
Gaussian distribution, for example, this requires an infinite number of representation points,
$-\infty < j < \infty$. Thus, in this example the quantized discrete rv V has a countably infinite
alphabet. Obviously, practical quantizers limit the number of points to a finite region $\mathcal{R}$ such
that $\int_{\mathcal{R}} f_U(u) \, du \approx 1$.


Assume that $\Delta$ is small enough that the pdf $f_U(u)$ is approximately constant over any one
quantization interval. More precisely, define $\bar{f}(u)$ (see Figure 3.8) as the average value of $f_U(u)$
over the quantization interval containing u,

$$ \bar{f}(u) = \frac{\int_{\mathcal{R}_j} f_U(u) \, du}{\Delta} \quad \text{for } u \in \mathcal{R}_j. \tag{3.6} $$

From (3.6) it is seen that $\Delta \bar{f}(u) = \Pr(\mathcal{R}_j)$ for all integer j and all $u \in \mathcal{R}_j$.


Figure 3.8: Average density over each $\mathcal{R}_j$.


The high-rate assumption is that $f_U(u) \approx \bar{f}(u)$ for all $u \in \mathbb{R}$. This means that $f_U(u) \approx \Pr(\mathcal{R}_j)/\Delta$
for $u \in \mathcal{R}_j$. It also means that the conditional pdf $f_{U|\mathcal{R}_j}(u)$ of U conditional on $u \in \mathcal{R}_j$ is
approximated by

$$ f_{U|\mathcal{R}_j}(u) \approx \begin{cases} 1/\Delta, & u \in \mathcal{R}_j; \\ 0, & u \notin \mathcal{R}_j. \end{cases} $$

Consequently the conditional mean $a_j$ is approximately in the center of the interval $\mathcal{R}_j$, and the
mean-squared error is approximately given by

$$ \mathrm{MSE} \approx \int_{-\Delta/2}^{\Delta/2} \frac{1}{\Delta} u^2 \, du = \frac{\Delta^2}{12} \tag{3.7} $$

for each quantization interval $\mathcal{R}_j$. Consequently this is also the overall MSE.


Next consider the entropy of the quantizer output V. The probability $p_j$ that $V = a_j$ is given
by both

$$ p_j = \int_{\mathcal{R}_j} f_U(u) \, du \quad \text{and, for all } u \in \mathcal{R}_j, \quad p_j = \bar{f}(u) \Delta. \tag{3.8} $$

Therefore the entropy of the discrete rv V is

$$ \begin{aligned} H[V] = \sum_j -p_j \log p_j &= \sum_j \int_{\mathcal{R}_j} -f_U(u) \log[\bar{f}(u) \Delta] \, du \\ &= \int_{-\infty}^{\infty} -f_U(u) \log[\bar{f}(u) \Delta] \, du \qquad (3.9) \\ &= \int_{-\infty}^{\infty} -f_U(u) \log[\bar{f}(u)] \, du - \log \Delta, \qquad (3.10) \end{aligned} $$

where the sum of disjoint integrals was combined into a single integral.


Finally, using the high-rate approximation³ $f_U(u) \approx \bar{f}(u)$, this becomes

$$ \begin{aligned} H[V] &\approx \int_{-\infty}^{\infty} -f_U(u) \log[f_U(u) \Delta] \, du \\ &= h[U] - \log \Delta. \qquad (3.11) \end{aligned} $$

Since the sequence $U_1, U_2, \ldots$ of inputs to the quantizer is memoryless (iid), the quantizer output
sequence $V_1, V_2, \ldots$ is an iid sequence of discrete random symbols representing quantization
points, i.e., a discrete memoryless source. A uniquely-decodable source code can therefore
be used to encode this output sequence into a bit sequence at an average rate of
$L \approx H[V] \approx h[U] - \log \Delta$ bits/symbol. At the receiver, the mean-squared quantization error in reconstructing
the original sequence is approximately $\mathrm{MSE} \approx \Delta^2/12$.
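

The two approximations can be checked numerically for a Gaussian source, as in the sketch below. It assumes a zero-mean, unit-variance source, places the representation points at the cell centers $j\Delta$ rather than at the exact conditional means, and truncates the (countably infinite) set of cells at about 10 standard deviations; the function name and these parameters are hypothetical.

```python
import math
from statistics import NormalDist

def uniform_quantizer_stats(delta, span=10.0, sub=200):
    """H[V] in bits and MSE for a unit Gaussian with cells centered at j*delta."""
    nd = NormalDist()
    jmax = int(span / delta) + 1
    entropy = mse = 0.0
    step = delta / sub
    for j in range(-jmax, jmax + 1):
        lo = (j - 0.5) * delta
        p = nd.cdf(lo + delta) - nd.cdf(lo)      # probability of cell j
        if p > 0:
            entropy -= p * math.log2(p)
        # Midpoint-rule integral of (u - a_j)^2 f_U(u) over the cell.
        for k in range(sub):
            u = lo + (k + 0.5) * step
            mse += (u - j * delta) ** 2 * nd.pdf(u) * step
    return entropy, mse

h_U = 0.5 * math.log2(2 * math.pi * math.e)      # differential entropy, sigma = 1
results = {delta: uniform_quantizer_stats(delta) for delta in (0.5, 0.25, 0.125)}
```

For each spacing, the first entry of `results[delta]` can be compared with `h_U - math.log2(delta)` and the second with `delta ** 2 / 12`; the agreement improves as the spacing shrinks.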


The important conclusions from this analysis are illustrated in Figure 3.9 and are summarized 
as follows: 


• Under the high-rate assumption, the rate L for a uniform quantizer followed by discrete
entropy coding depends only on the differential entropy h[U] of the source and the spacing
$\Delta$ of the quantizer. It does not depend on any other feature of the source pdf $f_U(u)$, nor on
any other feature of the quantizer, such as the number M of points, so long as the quantizer
intervals cover $f_U(u)$ sufficiently completely and finely.


• The rate $L \approx H[V]$ and the MSE are parametrically related by $\Delta$, i.e.,

$$ L \approx h[U] - \log \Delta; \qquad \mathrm{MSE} \approx \frac{\Delta^2}{12}. \tag{3.12} $$

Note that each reduction in $\Delta$ by a factor of 2 will reduce the MSE by a factor of 4
and increase the required transmission rate $L \approx H[V]$ by 1 bit/symbol. Communication
engineers express this by saying that each additional bit per symbol decreases the mean-squared
distortion⁴ by 6 dB. Figure 3.9 sketches MSE as a function of L.


³ Exercise 3.6 provides some insight into the nature of the approximation here. In particular, the difference
between $h[U] - \log \Delta$ and H[V] is $\int f_U(u) \log[\bar{f}(u)/f_U(u)] \, du$. This quantity is always nonpositive and goes to
zero with $\Delta$ as $\Delta^2$. Similarly, the approximation error on MSE goes to 0 as $\Delta^4$.


Figure 3.9: MSE as a function of L for a scalar quantizer with the high-rate approximation,
$\mathrm{MSE} \approx 2^{2h[U] - 2L}/12$. Note that changing the source entropy h[U] simply shifts the figure right or
left. Note also that log MSE is linear, with a slope of -2, as a function of L.


Conventional b-bit analog-to-digital (A/D) converters are uniform scalar $2^b$-level quantizers that
cover a certain range $\mathcal{R}$ with a quantizer spacing $\Delta = 2^{-b}|\mathcal{R}|$. The input samples must be scaled
so that the probability that $u \notin \mathcal{R}$ (the “overflow probability”) is small. For a fixed scaling of
the input, the tradeoff is again that increasing b by 1 bit reduces the MSE by a factor of 4.
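

A small simulation illustrates this tradeoff. The sketch below models an idealized b-bit converter over the range [-1, 1) that clips overflow to the end cells; the range, the Gaussian input scaling, the sample count, and the function name `adc` are assumptions made for the example.

```python
import random

def adc(u, b, lo=-1.0, hi=1.0):
    """Idealized b-bit uniform quantizer on [lo, hi): returns the cell midpoint."""
    delta = (hi - lo) / 2 ** b
    j = int((u - lo) / delta)
    j = max(0, min(2 ** b - 1, j))      # clip overflow to the end cells
    return lo + (j + 0.5) * delta

rng = random.Random(2)
# Hypothetical input: Gaussian scaled so that overflow is very unlikely.
samples = [rng.gauss(0, 0.2) for _ in range(50000)]
mse = {b: sum((u - adc(u, b)) ** 2 for u in samples) / len(samples)
       for b in (6, 7, 8)}
ratios = (mse[6] / mse[7], mse[7] / mse[8])
```

Each entry of `ratios` should be close to 4, i.e., roughly 6 dB per additional bit, up to sampling noise.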


Conventional A/D converters are not usually directly followed by entropy coding. The more 
conventional approach is to use A/D conversion to produce a very high rate digital signal that 
can be further processed by digital signal processing (DSP). This digital signal is then later 
compressed using algorithms specialized to the particular application (voice, images, etc.). In 
other words, the clean layers of Figure 3.1 oversimplify what is done in practice. On the other 
hand, it is often best to view compression in terms of the Figure 3.1 layers, and then use DSP 
as a way of implementing the resulting algorithms. 


The relation $H[V] \approx h[U] - \log \Delta$ provides an elegant interpretation of differential entropy.
It is obvious that there must be some kind of tradeoff between MSE and the entropy of the
representation, and the differential entropy specifies this tradeoff in a very simple way for high
rate uniform scalar quantizers. H[V] is the entropy of a finely quantized version of U, and the
additional term $\log \Delta$ relates to the “uncertainty” within an individual quantized interval. It
shows explicitly how the scale used to measure U affects h[U].


Appendix A considers nonuniform scalar quantizers under the high rate assumption and shows 
that nothing is gained in the high-rate limit by the use of nonuniformity. 


⁴ A quantity x expressed in dB is given by $10 \log_{10} x$. This very useful and common logarithmic measure is
discussed in detail in Chapter 6.


3.8 High-rate two-dimensional quantizers


The performance of uniform two-dimensional (2D) quantizers is now analyzed in the limit of
high rate. Appendix B considers the nonuniform case and shows that uniform quantizers are
again effectively optimal in the high-rate limit.


A 2D quantizer operates on 2 source samples $\mathbf{u} = (u_1, u_2)$ at a time; i.e., the source alphabet
is $\mathbf{U} = \mathbb{R}^2$. Assuming iid source symbols, the joint pdf is then $f_{\mathbf{U}}(\mathbf{u}) = f_U(u_1) f_U(u_2)$, and the
joint differential entropy is $h[\mathbf{U}] = 2h[U]$.

Like a uniform scalar quantizer, a uniform 2D quantizer is based on a fundamental quantization
region $\mathcal{R}$ (“quantization cell”) whose translates tile⁵ the 2D plane. In the one-dimensional case,
there is really only one sensible choice for $\mathcal{R}$, namely an interval of length $\Delta$, but in higher
dimensions there are many possible choices. For two dimensions, the most important choices
are squares and hexagons, but in higher dimensions, many more choices are available.


Notice that if a region $\mathcal{R}$ tiles $\mathbb{R}^2$, then any scaled version $\alpha\mathcal{R}$ of $\mathcal{R}$ will also tile $\mathbb{R}^2$, and so will
any rotation or translation of $\mathcal{R}$.


Consider the performance of a uniform 2D quantizer with a basic cell $\mathcal{R}$ which is centered at the
origin 0. The set of cells, which are assumed to tile the region, are denoted by⁶ $\{\mathcal{R}_j;\ j \in \mathbb{Z}^+\}$
where $\mathcal{R}_j = \mathbf{a}_j + \mathcal{R}$ and $\mathbf{a}_j$ is the center of the cell $\mathcal{R}_j$. Let $A(\mathcal{R}) = \int_{\mathcal{R}} d\mathbf{u}$ be the area of the
basic cell. The average pdf in a cell $\mathcal{R}_j$ is given by $\Pr(\mathcal{R}_j)/A(\mathcal{R}_j)$. As before, define $\bar{f}(\mathbf{u})$ to be
the average pdf over the region $\mathcal{R}_j$ containing $\mathbf{u}$. The high-rate assumption is again made, i.e.,
assume that the region $\mathcal{R}$ is small enough that $f_{\mathbf{U}}(\mathbf{u}) \approx \bar{f}(\mathbf{u})$ for all $\mathbf{u}$.


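The assumption $f_{\mathbf{U}}(\mathbf{u}) \approx \bar{f}(\mathbf{u})$ implies that the conditional pdf, conditional on $\mathbf{u} \in \mathcal{R}_j$, is
approximated by

$$ f_{\mathbf{U}|\mathcal{R}_j}(\mathbf{u}) \approx \begin{cases} 1/A(\mathcal{R}), & \mathbf{u} \in \mathcal{R}_j; \\ 0, & \mathbf{u} \notin \mathcal{R}_j. \end{cases} \tag{3.13} $$

The conditional mean is approximately equal to the center $\mathbf{a}_j$ of the region $\mathcal{R}_j$. The mean-squared
error per dimension for the basic quantization cell $\mathcal{R}$ centered on 0 is then approximately
equal to

$$ \mathrm{MSE} \approx \frac{1}{2} \int_{\mathcal{R}} \|\mathbf{u}\|^2 \frac{1}{A(\mathcal{R})} \, d\mathbf{u}. \tag{3.14} $$

The right side of (3.14) is the MSE for the quantization area $\mathcal{R}$ using a pdf equal to a constant; it
will be denoted $\mathrm{MSE}_c$. The quantity $\|\mathbf{u}\|$ is the length of the vector $(u_1, u_2)$, so that $\|\mathbf{u}\|^2 = u_1^2 + u_2^2$.
Thus $\mathrm{MSE}_c$ can be rewritten as

$$ \mathrm{MSE} \approx \mathrm{MSE}_c = \frac{1}{2} \int_{\mathcal{R}} (u_1^2 + u_2^2) \frac{1}{A(\mathcal{R})} \, du_1 \, du_2. \tag{3.15} $$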


$\mathrm{MSE}_c$ is measured in units of squared length, just like $A(\mathcal{R})$. Thus the ratio $G(\mathcal{R}) = \mathrm{MSE}_c/A(\mathcal{R})$
is a dimensionless quantity called the normalized second moment. With a little effort, it can
be seen that $G(\mathcal{R})$ is invariant to scaling, translation and rotation. $G(\mathcal{R})$ does depend on the
shape of the region $\mathcal{R}$, and, as seen below, it is $G(\mathcal{R})$ that determines how well a given shape
performs as a quantization region. By expressing

$$ \mathrm{MSE}_c = G(\mathcal{R}) A(\mathcal{R}), $$

it is seen that the MSE is the product of a shape term and an area term, and these can be
chosen independently.


⁵ A region of the 2D plane is said to tile the plane if the region, plus translates and rotations of the region,
fill the plane without overlap. For example the square and the hexagon tile the plane. Also, rectangles tile the
plane, and equilateral triangles with rotations tile the plane.

⁶ $\mathbb{Z}^+$ denotes the set of positive integers, so $\{\mathcal{R}_j;\ j \in \mathbb{Z}^+\}$ denotes the set of regions in the tiling, numbered in
some arbitrary way of no particular interest here.


As examples, G(R) is given below for some common shapes. 


• Square: For a square $\Delta$ on a side, $A(\mathcal{R}) = \Delta^2$. Breaking (3.15) into two terms, we see that
each is identical to the scalar case and $\mathrm{MSE}_c = \Delta^2/12$. Thus G(Square) = 1/12.


• Hexagon: View the hexagon as the union of 6 equilateral triangles $\Delta$ on a side. Then
$A(\mathcal{R}) = 3\sqrt{3}\Delta^2/2$ and $\mathrm{MSE}_c = 5\Delta^2/24$. Thus $G(\text{Hexagon}) = 5/(36\sqrt{3})$.


• Circle: For a circle of radius r, $A(\mathcal{R}) = \pi r^2$ and $\mathrm{MSE}_c = r^2/4$, so $G(\text{Circle}) = 1/(4\pi)$.


The circle is not an allowable quantization region, since it does not tile the plane. On the other
hand, for a given area, this is the shape that minimizes $\mathrm{MSE}_c$. To see this, note that for any
other shape, differential areas further from the origin can be moved closer to the origin with a
reduction in $\mathrm{MSE}_c$. That is, the circle is the 2D shape that minimizes $G(\mathcal{R})$. This also suggests
why G(Hexagon) < G(Square), since the hexagon is more concentrated around the origin than
the square.
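

The values of $G(\mathcal{R})$ quoted above can be reproduced numerically. The sketch below uses the standard closed-form polygon expressions for area and for the integral of $\|\mathbf{u}\|^2$ (the polar second moment about the origin); these formulas, the function names, and the choice of unit side length are assumptions of the example and are not taken from the text.

```python
import math

def second_moment_stats(vertices):
    """Area, MSE_c and G for a polygon centered at the origin.

    Vertices must be listed counterclockwise. 'polar' accumulates the
    integral of (x^2 + y^2) over the polygon, edge by edge.
    """
    area = polar = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area += cross / 2
        polar += cross * (x0 * x0 + x0 * x1 + x1 * x1 +
                          y0 * y0 + y0 * y1 + y1 * y1) / 12
    mse_c = polar / (2 * area)          # per dimension, uniform pdf 1/A
    return area, mse_c, mse_c / area

def regular_polygon(sides, side_length):
    """Counterclockwise vertices of a regular polygon centered at the origin."""
    r = side_length / (2 * math.sin(math.pi / sides))   # circumradius
    return [(r * math.cos(2 * math.pi * k / sides),
             r * math.sin(2 * math.pi * k / sides)) for k in range(sides)]

delta = 1.0
_, _, g_square = second_moment_stats(regular_polygon(4, delta))
_, mse_hex, g_hex = second_moment_stats(regular_polygon(6, delta))
g_circle = 1 / (4 * math.pi)
```

The square produced by `regular_polygon` is rotated by 45 degrees relative to the axes, which does not matter since $G(\mathcal{R})$ is invariant to rotation. The three values satisfy `g_circle < g_hex < g_square`, consistent with the ordering argued above.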


Using the high rate approximation for any given tiling, each quantization cell $\mathcal{R}_j$ has the same
shape and area and has a conditional pdf which is approximately uniform. Thus $\mathrm{MSE}_c$ approximates
the MSE for each quantization region and thus approximates the overall MSE.


Next consider the entropy of the quantizer output. The probability that $\mathbf{U}$ falls in the region
$\mathcal{R}_j$ is

$$ p_j = \int_{\mathcal{R}_j} f_{\mathbf{U}}(\mathbf{u}) \, d\mathbf{u} \quad \text{and, for all } \mathbf{u} \in \mathcal{R}_j, \quad p_j = \bar{f}(\mathbf{u}) A(\mathcal{R}). $$

The output of the quantizer is the discrete random symbol V with the pmf $p_j$ for each symbol
j. As before, the entropy of V is given by

$$ \begin{aligned} H[V] &= -\sum_j p_j \log p_j \\ &= -\sum_j \int_{\mathcal{R}_j} f_{\mathbf{U}}(\mathbf{u}) \log[\bar{f}(\mathbf{u}) A(\mathcal{R})] \, d\mathbf{u} \\ &= -\int f_{\mathbf{U}}(\mathbf{u}) \left[ \log \bar{f}(\mathbf{u}) + \log A(\mathcal{R}) \right] d\mathbf{u} \\ &\approx -\int f_{\mathbf{U}}(\mathbf{u}) \left[ \log f_{\mathbf{U}}(\mathbf{u}) \right] d\mathbf{u} - \log A(\mathcal{R}) \\ &= 2h[U] - \log A(\mathcal{R}), \end{aligned} $$

where the high rate approximation $f_{\mathbf{U}}(\mathbf{u}) \approx \bar{f}(\mathbf{u})$ was used. Note that, since $\mathbf{U} = U_1 U_2$ for iid
variables $U_1$ and $U_2$, the differential entropy of $\mathbf{U}$ is 2h[U].


Again, an efficient uniquely-decodable source code can be used to encode the quantizer output
sequence into a bit sequence at an average rate per source symbol of

$$ L \approx \frac{H[V]}{2} \approx h[U] - \frac{1}{2} \log A(\mathcal{R}) \quad \text{bits/symbol}. \tag{3.16} $$

At the receiver, the mean-squared quantization error in reconstructing the original sequence will
be approximately equal to the MSE given in (3.14).


We have the following important conclusions for a uniform 2D quantizer under the high-rate 
approximation: 


• Under the high-rate assumption, the rate L depends only on the differential entropy h[U] of
the source and the area $A(\mathcal{R})$ of the basic quantization cell $\mathcal{R}$. It does not depend on any
other feature of the source pdf $f_U(u)$, and does not depend on the shape of the quantizer
region, i.e., it does not depend on the normalized second moment $G(\mathcal{R})$.


• There is a tradeoff between the rate L and MSE that is governed by the area $A(\mathcal{R})$. From
(3.16), an increase of 1 bit/symbol in rate corresponds to a decrease in $A(\mathcal{R})$ by a factor of
4. From (3.14), this decreases the MSE by a factor of 4, i.e., by 6 dB.


The ratio G(Square)/G(Hexagon) is equal to $3\sqrt{3}/5 = 1.0392$. This is called the quantizing
gain of the hexagon over the square. For a given $A(\mathcal{R})$ (and thus a given L), the MSE for a
hexagonal quantizer is smaller than that for a square quantizer (and thus also for a scalar
quantizer) by a factor of 1.0392 (0.17 dB). This is a disappointingly small gain given the
added complexity of 2D and hexagonal regions and suggests that uniform scalar quantizers
are good choices at high rates.


3.9 Summary of quantization 


Quantization is important both for digitizing a sequence of analog signals and as the middle 
layer in digitizing analog waveform sources. Uniform scalar quantization is the simplest and 
often most practical approach to quantization. Before reaching this conclusion, two approaches 
to optimal scalar quantizers were taken. The first attempted to minimize the expected distortion 
subject to a fixed number M of quantization regions, and the second attempted to minimize 
the expected distortion subject to a fixed entropy of the quantized output. Each approach was 
followed by the extension to vector quantization. 


In both approaches, and for both scalar and vector quantization, the emphasis was on minimizing 
mean square distortion or error (MSE), as opposed to some other distortion measure. As will 
be seen later, MSE is the natural distortion measure in going from waveforms to sequences of 
analog values. For specific sources, such as speech, however, MSE is not appropriate. For an 
introduction to quantization, however, focusing on MSE seems appropriate in building intuition; 
again, our approach is building understanding through the use of simple models. 


The first approach, minimizing MSE with a fixed number of regions, leads to the Lloyd-Max 
algorithm, which finds a local minimum of MSE. Unfortunately, the local minimum is not 
necessarily a global minimum, as seen by several examples. For vector quantization, the problem 
of local (but not global) minima arising from the Lloyd-Max algorithm appears to be the typical 
case.


The second approach, minimizing MSE with a constraint on the output entropy, is also a difficult
problem analytically. This is the appropriate approach in a two layer solution where the
quantizer is followed by discrete encoding. On the other hand, the first approach is more appropriate
when vector quantization is to be used but cannot be followed by fixed-to-variable-length
discrete source coding.


High-rate scalar quantization, where the quantization regions can be made sufficiently small so
that the probability density is almost constant over each region, leads to a much simpler result
when followed by entropy coding. In the limit of high rate, a uniform scalar quantizer minimizes
MSE for a given entropy constraint. Moreover, the tradeoff between minimum MSE and output
entropy is the simple universal curve of Figure 3.9. The source is completely characterized by
its differential entropy in this tradeoff. The approximations in this result are analyzed in Exercise 3.6.
Two-dimensional vector quantization under the high-rate approximation with entropy
coding leads to a similar result. Using a square quantization region to tile the plane, the tradeoff
between MSE per symbol and entropy per symbol is the same as with scalar quantization.
Using a hexagonal quantization region to tile the plane reduces the MSE by a factor of 1.0392,
which seems hardly worth the trouble. It is possible that non-uniform two-dimensional quantizers
might achieve a smaller MSE than a hexagonal tiling, but this gain is still limited by the
circular shaping gain, which is $\pi/3 = 1.0472$ (0.2 dB). Using non-uniform quantization regions
at high rate leads to a lower bound on MSE which is lower than that for the scalar uniform
quantizer by a factor of 1.0472, which, even if achievable, is scarcely worth the trouble.


The use of high-dimensional quantizers can achieve slightly higher gains over the uniform scalar
quantizer, but the gain is still limited by a fundamental information-theoretic result to
$\pi e/6 = 1.423$ (1.53 dB).


3A Appendix A: Nonuniform scalar quantizers 


This appendix shows that the approximate MSE for uniform high-rate scalar quantizers in Sec- 
tion 3.7 provides an approximate lower bound on the MSE for any nonuniform scalar quantizer, 
again using the high-rate approximation that the pdf of U is constant within each quantiza- 
tion region. This shows that in the high-rate region, there is little reason to further consider 
nonuniform scalar quantizers. 


Consider an arbitrary scalar quantizer for an rv $U$ with a pdf $f_U(u)$. Let $\Delta_j$ be the width of the
$j$th quantization interval, i.e., $\Delta_j = |R_j|$. As before, let $\bar f(u)$ be the average pdf within each
quantization interval, i.e.,

$$\bar f(u) = \frac{\int_{R_j} f_U(u)\,du}{\Delta_j} \qquad \text{for } u \in R_j.$$

The high-rate approximation is that $f_U(u)$ is approximately constant over each quantization
region. Equivalently, $f_U(u) \approx \bar f(u)$ for all $u$. Thus, if region $R_j$ has width $\Delta_j$, the conditional
mean $a_j$ of $U$ over $R_j$ is approximately the midpoint of the region, and the conditional
mean-squared error, $\mathrm{MSE}_j$, given $U \in R_j$, is approximately $\Delta_j^2/12$.


Let $V$ be the quantizer output, i.e., the discrete rv such that $V = a_j$ whenever $U \in R_j$. The
probability $p_j$ that $V = a_j$ is $p_j = \int_{R_j} f_U(u)\,du$.

The unconditional mean-squared error, i.e., $E[(U - V)^2]$, is then given by

$$\mathrm{MSE} \approx \sum_j p_j \frac{\Delta_j^2}{12} = \sum_j \int_{R_j} f_U(u)\,\frac{\Delta_j^2}{12}\,du. \tag{3.17}$$


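This can be simplified by defining $\Delta(u) = \Delta_j$ for $u \in R_j$. Since each $u$ is in $R_j$ for some $j$, this
defines $\Delta(u)$ for all $u \in \mathbb{R}$. Substituting this in (3.17),

$$\mathrm{MSE} \approx \sum_j \int_{R_j} f_U(u)\,\frac{\Delta(u)^2}{12}\,du \tag{3.18}$$

$$= \int_{-\infty}^{\infty} f_U(u)\,\frac{\Delta(u)^2}{12}\,du. \tag{3.19}$$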


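Next consider the entropy of $V$. As in (3.8), the following relations are used for $p_j$:

$$p_j = \int_{R_j} f_U(u)\,du \qquad \text{and, for all } u \in R_j, \quad p_j = \bar f(u)\,\Delta(u).$$

$$H[V] = \sum_j -p_j \log p_j$$

$$= \sum_j \int_{R_j} -f_U(u) \log[\bar f(u)\Delta(u)]\,du \tag{3.20}$$

$$= \int_{-\infty}^{\infty} -f_U(u) \log[\bar f(u)\Delta(u)]\,du, \tag{3.21}$$

where the multiple integrals over disjoint regions have been combined into a single integral. The
high-rate approximation $f_U(u) \approx \bar f(u)$ is next substituted into (3.21).

$$H[V] \approx \int_{-\infty}^{\infty} -f_U(u) \log[f_U(u)\Delta(u)]\,du$$

$$= h[U] - \int_{-\infty}^{\infty} f_U(u) \log \Delta(u)\,du. \tag{3.22}$$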


Note the similarity of this to (3.11). 


The next step is to minimize the mean-squared error subject to a constraint on the entropy
$H[V]$. This is done approximately by minimizing the approximation to MSE in (3.19) subject
to the approximation to $H[V]$ in (3.22). Exercise 3.6 provides some insight into the accuracy of
these approximations and their effect on this minimization.


Consider using a Lagrange multiplier to perform the minimization. Since MSE decreases as
$H[V]$ increases, consider minimizing $\mathrm{MSE} + \lambda H[V]$. As $\lambda$ increases, MSE will increase and $H[V]$
decrease in the minimizing solution.


In principle, the minimization should be constrained by the fact that $\Delta(u)$ must
represent the interval sizes for a realizable set of quantization regions. The minimum of
$\mathrm{MSE} + \lambda H[V]$ will be lower bounded by ignoring this constraint. The very nice thing that happens is that
this unconstrained lower bound occurs where $\Delta(u)$ is constant. This corresponds to a uniform
quantizer, which is clearly realizable. In other words, subject to the high-rate approximation,
the lower bound on MSE over all scalar quantizers is equal to the MSE for the uniform scalar
quantizer. To see this, use (3.19) and (3.22),

$$\mathrm{MSE} + \lambda H[V] \approx \int_{-\infty}^{\infty} f_U(u)\,\frac{\Delta(u)^2}{12}\,du + \lambda h[U] - \lambda \int_{-\infty}^{\infty} f_U(u) \log \Delta(u)\,du$$

$$= \lambda h[U] + \int_{-\infty}^{\infty} f_U(u) \left\{ \frac{\Delta(u)^2}{12} - \lambda \log \Delta(u) \right\} du. \tag{3.23}$$

This is minimized over all choices of $\Delta(u) > 0$ by simply minimizing the expression inside the
braces for each real value of $u$. That is, for each $u$, differentiate the quantity inside the braces
with respect to $\Delta(u)$, getting $\Delta(u)/6 - \lambda(\log e)/\Delta(u)$. Setting the derivative equal to 0, it
is seen that $\Delta(u) = \sqrt{6\lambda(\log e)}$. By taking the second derivative, it can be seen that this
solution actually minimizes the integrand for each $u$. The only important thing here is that the
minimizing $\Delta(u)$ is independent of $u$. This means that the approximation of MSE is minimized,
subject to a constraint on the approximation of $H[V]$, by the use of a uniform quantizer.
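As a numerical illustration of this conclusion, the following Python sketch evaluates the approximations (3.19) and (3.22) by midpoint integration for two interval-width functions Δ(u). The test density f_U(u) = 1/2 + u on [0, 1] is borrowed from Exercise 3.6; the function names, the grid size, and the particular nonuniform choice of Δ(u) are assumptions made only for illustration. The sketch then compares the nonuniform quantizer's approximate MSE with what a uniform quantizer would achieve at the same approximate entropy.

```python
import math

def high_rate_mse_entropy(pdf, delta, lo, hi, n=20000):
    """Midpoint-rule estimates of (3.19) and (3.22); entropy is in bits."""
    step = (hi - lo) / n
    mse = 0.0
    entropy = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * step
        f = pdf(u)
        if f <= 0.0:
            continue  # regions of zero density contribute nothing
        d = delta(u)
        mse += f * d * d / 12.0 * step
        entropy -= f * math.log2(f * d) * step
    return mse, entropy

def pdf(u):
    return 0.5 + u  # density from Exercise 3.6, supported on [0, 1]

def uniform(u):
    return 0.01  # constant interval width

def nonuniform(u):
    # Illustrative choice: narrower intervals where the density is larger.
    return 0.01 * pdf(u) ** (-1.0 / 3.0)

mse_u, h_u = high_rate_mse_entropy(pdf, uniform, 0.0, 1.0)
mse_n, h_n = high_rate_mse_entropy(pdf, nonuniform, 0.0, 1.0)

# A uniform quantizer retuned to entropy h_n has width 0.01 * 2**(h_u - h_n),
# because (3.22) gives H[V] = h[U] - log(width) when the width is constant.
mse_u_matched = (0.01 * 2.0 ** (h_u - h_n)) ** 2 / 12.0

print(f"uniform:    MSE = {mse_u:.3e}, H[V] = {h_u:.3f} bits")
print(f"nonuniform: MSE = {mse_n:.3e}, H[V] = {h_n:.3f} bits")
print(f"uniform at the nonuniform entropy: MSE = {mse_u_matched:.3e}")
```

Under these approximations the nonuniform MSE is never below the matched uniform value, which is Jensen's inequality applied to Δ(u). The sketch works with the approximations only and does not check that the nonuniform Δ(u) corresponds to an exactly realizable set of intervals.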


The next question is the meaning of minimizing an approximation to something subject to 
a constraint which itself is an approximation. From Exercise 3.6, it is seen that both the
approximation to MSE and that to $H[V]$ are good approximations for small $\Delta$, i.e., for high
rate. For any given high-rate nonuniform quantizer then, consider plotting MSE and $H[V]$ on
Figure 3.9. The corresponding approximate values of MSE and H[V] are then close to the plotted 
value (with some small difference both in the ordinate and abscissa). These approximate values, 
however, lie above the approximate values plotted in Figure 3.9 for the scalar quantizer. Thus, 
in this sense, the performance curve of MSE versus H[V] for the approximation to the scalar 
quantizer either lies below or close to the points for any nonuniform quantizer. 


In summary, it has been shown that for large $H[V]$ (i.e., high-rate quantization), a uniform
scalar quantizer approximately minimizes MSE subject to the entropy constraint. There is
little reason to use nonuniform scalar quantizers (except perhaps at low rate). Furthermore the
MSE performance at high rate can be easily approximated and depends only on $h[U]$ and the
constraint on $H[V]$.


3B Appendix B: Nonuniform 2D quantizers 


For completeness, the performance of nonuniform 2D quantizers is now analyzed; the analysis
is very similar to that of nonuniform scalar quantizers. Consider an arbitrary set of quantization
regions $\{R_j\}$. Let $A(R_j)$ and $\mathrm{MSE}_j$ be the area and mean-squared error per dimension
respectively of $R_j$, i.e.,

$$A(R_j) = \int_{R_j} du\,; \qquad \mathrm{MSE}_j = \frac{1}{2} \int_{R_j} \frac{\|u - a_j\|^2}{A(R_j)}\,du,$$

where $a_j$ is the mean of $R_j$. For each region $R_j$ and each $u \in R_j$, let $\bar f(u) = \Pr(R_j)/A(R_j)$ be
the average pdf in $R_j$. Then

$$p_j = \int_{R_j} f_U(u)\,du = \bar f(u) A(R_j).$$

The unconditioned mean-squared error is then

$$\mathrm{MSE} = \sum_j p_j\,\mathrm{MSE}_j.$$

Let $A(u) = A(R_j)$ and $\mathrm{MSE}(u) = \mathrm{MSE}_j$ for $u \in R_j$. Then,

$$\mathrm{MSE} = \int f_U(u)\,\mathrm{MSE}(u)\,du. \tag{3.24}$$

Similarly,

$$H[V] = \sum_j -p_j \log p_j$$

$$= \int -f_U(u) \log[\bar f(u) A(u)]\,du$$

$$\approx \int -f_U(u) \log[f_U(u) A(u)]\,du \tag{3.25}$$

$$= 2h[U] - \int f_U(u) \log[A(u)]\,du. \tag{3.26}$$


A Lagrange multiplier can again be used to solve for the optimum quantization regions under
the high-rate approximation. In particular, from (3.24) and (3.26),

$$\mathrm{MSE} + \lambda H[V] \approx 2\lambda h[U] + \int_{\mathbb{R}^2} f_U(u) \left\{ \mathrm{MSE}(u) - \lambda \log A(u) \right\} du. \tag{3.27}$$


Since each quantization area can be different, the quantization regions need not have geometric
shapes whose translates tile the plane. As pointed out earlier, however, the shape that minimizes
MSE for a given quantization area is a circle. Therefore the MSE can be lower bounded in the
Lagrange multiplier expression by using this shape. Replacing $\mathrm{MSE}(u)$ by $A(u)/(4\pi)$ in (3.27),

$$\mathrm{MSE} + \lambda H[V] \approx 2\lambda h[U] + \int_{\mathbb{R}^2} f_U(u) \left\{ \frac{A(u)}{4\pi} - \lambda \log A(u) \right\} du. \tag{3.28}$$

Optimizing for each $u$ separately, $A(u) = 4\pi\lambda \log e$. The optimum is achieved where the same
size circle is used for each point $u$ (independent of the probability density). This is unrealizable,
but still provides a lower bound on the MSE for any given $H[V]$ in the high-rate region. The
reduction in MSE over the square region is π/3 = 1.0472 (0.2 dB). It appears that the uniform
quantizer with hexagonal shape is optimal, but this figure of π/3 provides a simple bound to
the possible gain with 2D quantizers. Either way, the improvement by going to two dimensions
is small.


The same sort of analysis can be carried out for $n$-dimensional quantizers. In place of using a
circle as a lower bound, one now uses an $n$-dimensional sphere. As $n$ increases, the resulting
lower bound to MSE approaches a gain of πe/6 = 1.4233 (1.53 dB) over the scalar quantizer.
It is known from a fundamental result in information theory that this gain can be approached
arbitrarily closely as $n \to \infty$.
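The trend toward πe/6 can be tabulated directly. The sketch below is an illustration under one stated assumption: it uses the standard closed form for the normalized second moment of an n-dimensional sphere, G_n = Γ(n/2 + 1)^(2/n) / ((n + 2)π), which is not derived in the text, and compares it with 1/12 for the cube. The function name is hypothetical.

```python
import math

def sphere_gain(n):
    """MSE gain of an n-dimensional sphere over the n-cube of equal volume.

    Assumes the normalized second moment of an n-sphere is
    G_n = Gamma(n/2 + 1)**(2/n) / ((n + 2) * pi); the cube has 1/12.
    """
    # Work with logarithms so that large n does not overflow the gamma function.
    log_g = (2.0 / n) * math.lgamma(n / 2.0 + 1.0) - math.log((n + 2) * math.pi)
    return (1.0 / 12.0) / math.exp(log_g)

limit = math.pi * math.e / 6.0
for n in (1, 2, 3, 8, 24, 100, 10000):
    g = sphere_gain(n)
    print(f"n = {n:5d}: gain = {g:.4f} ({10 * math.log10(g):.2f} dB)")
print(f"limit pi*e/6 = {limit:.4f} ({10 * math.log10(limit):.2f} dB)")
```

For n = 1 the gain is 1 (the interval is the one-dimensional sphere) and for n = 2 it is π/3, matching the values in this appendix.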
3.E Exercises


3.1. Let $U$ be an analog random variable (rv) uniformly distributed between −1 and 1.
(a) Find the three-bit ($M = 8$) quantizer that minimizes the mean-squared error.
(b) Argue that your quantizer satisfies the necessary conditions for optimality.


(c) Show that the quantizer is unique in the sense that no other 3-bit quantizer satisfies the 
necessary conditions for optimality. 


3.2. Consider a discrete-time, analog source with memory, i.e., $U_1, U_2, \ldots$ are dependent rv's.
Assume that each $U_k$ is uniformly distributed between 0 and 1 but that $U_{2n} = U_{2n-1}$ for
each $n \ge 1$. Assume that $\{U_{2n}\}_{n=1}^{\infty}$ are independent.


(a) Find the one-bit ($M = 2$) scalar quantizer that minimizes the mean-squared error.
(b) Find the mean-squared error for the quantizer that you have found in (a).


(c) Find the one-bit-per-symbol ($M = 4$) two-dimensional vector quantizer that minimizes
the MSE.


(d) Plot the two-dimensional regions and representation points for both your scalar quantizer 
in part (a) and your vector quantizer in part (c). 


3.3. Consider a binary scalar quantizer that partitions the reals $\mathbb{R}$ into two subsets, $(-\infty, b]$ and
$(b, \infty)$ and then represents $(-\infty, b]$ by $a_1 \in \mathbb{R}$ and $(b, \infty)$ by $a_2 \in \mathbb{R}$. This quantizer is used
on each letter $U_n$ of a sequence $\ldots, U_{-1}, U_0, U_1, \ldots$ of iid random variables, each having
the probability density $f(u)$. Assume throughout this exercise that $f(u)$ is symmetric, i.e.,
that $f(u) = f(-u)$ for all $u \ge 0$.


(a) Given the representation levels $a_1$ and $a_2 > a_1$, how should $b$ be chosen to minimize the
mean square distortion in the quantization? Assume that $f(u) > 0$ for $a_1 \le u \le a_2$ and
explain why this assumption is relevant.


(b) Given $b \ge 0$, find the values of $a_1$ and $a_2$ that minimize the mean square distortion. Give
both answers in terms of the two functions $Q(x) = \int_x^{\infty} f(u)\,du$ and $y(x) = \int_x^{\infty} u f(u)\,du$.
(c) Show that for $b = 0$, the minimizing values of $a_1$ and $a_2$ satisfy $a_1 = -a_2$.

(d) Show that the choice of $b$, $a_1$, and $a_2$ in part (c) satisfies the Lloyd-Max conditions for
minimum mean square distortion.


(e) Consider the particular symmetric density below.

[Figure: $f(u)$ consists of three rectangular pulses, each of height $1/(3\epsilon)$ and width $\epsilon$, centered at $u = -1$, $0$, and $1$.]

Find all sets of triples, $\{b, a_1, a_2\}$ that satisfy the Lloyd-Max conditions and evaluate the
MSE for each. You are welcome in your calculation to replace each region of non-zero
probability density above with an impulse, i.e., $f(u) = \frac{1}{3}[\delta(-1) + \delta(0) + \delta(1)]$, but you
should use the figure above to resolve the ambiguity about regions that occurs when $b$ is −1,
0, or +1.
(f) Give the MSE for each of your solutions above (in the limit of $\epsilon \to 0$). Which of your
solutions minimizes the MSE?


3.4. In Section 3.4, we partly analyzed a minimum-MSE quantizer for a pdf in which $f_U(u) = f_1$
over an interval of size $L_1$, $f_U(u) = f_2$ over an interval of size $L_2$ and $f_U(u) = 0$ elsewhere.
Let $M$ be the total number of representation points to be used, with $M_1$ in the first interval
and $M_2 = M - M_1$ in the second. Assume (from symmetry) that the quantization intervals
are of equal size $\Delta_1 = L_1/M_1$ in interval 1 and of equal size $\Delta_2 = L_2/M_2$ in interval 2.
Assume that $M$ is very large, so that we can approximately minimize the MSE over $M_1, M_2$
without an integer constraint on $M_1, M_2$ (that is, assume that $M_1, M_2$ can be arbitrary real
numbers).


(a) Show that the MSE is minimized if $\Delta_1 f_1^{1/3} = \Delta_2 f_2^{1/3}$, i.e., the quantization interval
sizes are inversely proportional to the cube root of the density. [Hint: Use a Lagrange
multiplier to perform the minimization. That is, to minimize a function $\mathrm{MSE}(\Delta_1, \Delta_2)$
subject to a constraint $M = f(\Delta_1, \Delta_2)$, first minimize $\mathrm{MSE}(\Delta_1, \Delta_2) + \lambda f(\Delta_1, \Delta_2)$ without
the constraint, and, second, choose $\lambda$ so that the solution meets the constraint.]


(b) Show that the minimum MSE under the above assumption is given by

$$\mathrm{MSE} = \frac{\left( L_1 f_1^{1/3} + L_2 f_2^{1/3} \right)^3}{12 M^2}.$$


(c) Assume that the Lloyd-Max algorithm is started with $0 < M_1 < M$ representation
points in the first interval and $M_2 = M - M_1$ points in the second interval. Explain where
the Lloyd-Max algorithm converges for this starting point. Assume from here on that the
distance between the two intervals is very large.

(d) Redo part (c) under the assumption that the Lloyd-Max algorithm is started with
$0 < M_1 < M - 2$ representation points in the first interval, one point between the two
intervals, and the remaining points in the second interval.

(e) Express the exact minimum MSE as a minimum over $M - 1$ possibilities, with one term
for each choice of $0 < M_1 < M$ (assume there are no representation points between the two
intervals).

(f) Now consider an arbitrary choice of $\Delta_1$ and $\Delta_2$ (with no constraint on $M$). Show that
the entropy of the set of quantization points is

$$H(V) = -f_1 L_1 \log(f_1 \Delta_1) - f_2 L_2 \log(f_2 \Delta_2).$$


(g) Show that if we minimize the MSE subject to a constraint on this entropy (ignoring the
integer constraint on quantization levels), then $\Delta_1 = \Delta_2$.


3.5. (a) Assume that a continuous valued rv $Z$ has a probability density that is 0 except over the
interval $[-A, +A]$. Show that the differential entropy $h(Z)$ is upper bounded by $1 + \log_2 A$.
(b) Show that $h(Z) = 1 + \log_2 A$ if and only if $Z$ is uniformly distributed between $-A$ and
$+A$.

3.6. Let $f_U(u) = 1/2 + u$ for $0 \le u \le 1$ and $f_U(u) = 0$ elsewhere.


(a) For $\Delta < 1$, consider a quantization region $R = (x, x + \Delta]$ for $0 \le x \le 1 - \Delta$. Find the
conditional mean of $U$ conditional on $U \in R$.
(b) Find the conditional mean-squared error (MSE) of $U$ conditional on $U \in R$. Show that,
as $\Delta$ goes to 0, the difference between the MSE and the approximation $\Delta^2/12$ goes to 0 as
$\Delta^4$.

(c) For any given $\Delta$ such that $1/\Delta = M$, $M$ a positive integer, let $\{R_j = ((j-1)\Delta, j\Delta]\}$ be
the set of regions for a uniform scalar quantizer with $M$ quantization intervals. Show that
the difference between $h[U] - \log \Delta$ and $H[V]$ as given in (3.10) is

$$h[U] - \log \Delta - H[V] = \int_0^1 f_U(u) \log[\bar f(u)/f_U(u)]\,du.$$


(d) Show that the difference in part (c) is nonpositive, i.e., that $H[V] \ge h[U] - \log \Delta$. Hint: use the inequality $\ln x \le x - 1$.
Note that your argument does not depend on the particular choice of $f_U(u)$.

(e) Show that the difference $h[U] - \log \Delta - H[V]$ goes to 0 as $\Delta^2$ as $\Delta \to 0$. Hint: Use the
approximation $\ln x \approx (x - 1) - (x - 1)^2/2$, which is the second-order Taylor series expansion
of $\ln x$ around $x = 1$.

The major error in the high-rate approximation for small $\Delta$ and smooth $f_U(u)$ is due to
the slope of $f_U(u)$. Your results here show that this linear term is insignificant for both
the approximation of MSE and for the approximation of $H[V]$. More work is required to
validate the approximation in regions where $f_U(u)$ goes to 0.


3.7. (Example where $h(U)$ is infinite.) Let $f_U(u)$ be given by

$$f_U(u) = \begin{cases} \dfrac{1}{u(\ln u)^2} & \text{for } u \ge e \\ 0 & \text{for } u < e. \end{cases}$$


(a) Show that $f_U(u)$ is non-negative and integrates to 1.
(b) Show that $h(U)$ is infinite.
(c) Show that a uniform scalar quantizer for this source with any separation $\Delta$ ($0 < \Delta < \infty$)
has infinite entropy. Hint: Use the approach in Exercise 3.6, parts (c) and (d).


3.8. (Divergence and the extremal property of Gaussian entropy) The divergence between two
probability densities $f(x)$ and $g(x)$ is defined by

$$D(f \| g) = \int_{-\infty}^{\infty} f(x) \ln \frac{f(x)}{g(x)}\,dx.$$


(a) Show that $D(f \| g) \ge 0$. Hint: use the inequality $\ln y \le y - 1$ for $y \ge 0$ on $-D(f \| g)$.
You may assume that $g(x) > 0$ where $f(x) > 0$.

(b) Let $\int_{-\infty}^{\infty} x^2 f(x)\,dx = \sigma^2$ and let $g(x) = \phi(x)$ where $\phi(x)$ is the density of the rv $\mathcal{N}(0, \sigma^2)$.
Express $D(f \| \phi)$ in terms of the differential entropy (in nats) of a rv with density $f(x)$.


(c) Use (a) and (b) to show that the Gaussian rv $\mathcal{N}(0, \sigma^2)$ has the largest differential entropy
of any rv with variance $\sigma^2$ and that that differential entropy is $\frac{1}{2} \ln(2\pi e \sigma^2)$.


3.9. Consider a discrete source $U$ with a finite alphabet of $N$ real numbers, $r_1 < r_2 < \cdots < r_N$
with the pmf $p_1 > 0, \ldots, p_N > 0$. The set $\{r_1, \ldots, r_N\}$ is to be quantized into a smaller set
of $M < N$ representation points $a_1 < a_2 < \cdots < a_M$.
(a) Let $R_1, R_2, \ldots, R_M$ be a given set of quantization intervals with $R_1 = (-\infty, b_1]$, $R_2 =
(b_1, b_2], \ldots, R_M = (b_{M-1}, \infty)$. Assume that at least one source value $r_i$ is in $R_j$ for each
$j$, $1 \le j \le M$ and give a necessary condition on the representation points $\{a_j\}$ to achieve
minimum MSE.


(b) For a given set of representation points $a_1, \ldots, a_M$ assume that no symbol $r_i$ lies exactly
halfway between two neighboring $a_j$, i.e., that $r_i \ne (a_j + a_{j+1})/2$ for all $i, j$. For each $r_i$, find
the interval $R_j$ (and more specifically the representation point $a_j$) that $r_i$ must be mapped
into to minimize MSE. Note that it is not necessary to place the boundary $b_j$ between $R_j$
and $R_{j+1}$ at $b_j = (a_j + a_{j+1})/2$ since there is no probability in the immediate vicinity of
$(a_j + a_{j+1})/2$.

(c) For the given representation points, $a_1, \ldots, a_M$, now assume that $r_i = (a_j + a_{j+1})/2$ for some
source symbol $r_i$ and some $j$. Show that the MSE is the same whether $r_i$ is mapped into
$a_j$ or into $a_{j+1}$.

(d) For the assumption in part (c), show that the set $\{a_j\}$ cannot possibly achieve minimum
MSE. Hint: Look at the optimal choice of $a_j$ and $a_{j+1}$ for each of the two cases of part (c).


3.10. Assume an iid discrete-time analog source $U_1, U_2, \ldots$ and consider a scalar quantizer that
satisfies the Lloyd-Max conditions. Show that the rectangular 2-dimensional quantizer based
on this scalar quantizer also satisfies the Lloyd-Max conditions.


3.11. (a) Consider a square two dimensional quantization region $R$ defined by $-\Delta/2 \le u_1 \le \Delta/2$ and
$-\Delta/2 \le u_2 \le \Delta/2$. Find $\mathrm{MSE}_c$ as defined in (3.15) and show that it is proportional to $\Delta^2$.


(b) Repeat part (a) with $\Delta$ replaced by $a\Delta$. Show that $\mathrm{MSE}_c/A(R)$ (where $A(R)$ is now
the area of the scaled region) is unchanged.


(c) Explain why this invariance to scaling of $\mathrm{MSE}_c/A(R)$ is valid for any two dimensional
region.
Chapter 4


Source and channel waveforms 


4.1 Introduction 


This chapter has a dual objective. The first is to understand analog data compression, i.e., 
the compression of sources such as voice for which the output is an arbitrarily varying real or 
complex valued function of time; we denote such functions as waveforms. The second is to begin 
studying the waveforms that are typically transmitted at the input and received at the output of 
communication channels. The same set of mathematical tools is needed for the understanding
and representation of both source and channel waveforms; the development of these results is 
the central topic in this chapter. 


These results about waveforms are standard topics in mathematical courses on analysis, real 
and complex variables, functional analysis, and linear algebra. They are stated here without the 
precision or generality of a good mathematics text, but with considerably more precision and 
interpretation than is found in most engineering texts. 


4.1.1 Analog sources 


The output of many analog sources (voice is the typical example) can be represented as a
waveform,¹ $\{u(t) : \mathbb{R} \to \mathbb{R}\}$ or $\{u(t) : \mathbb{R} \to \mathbb{C}\}$. Often, as with voice, we are interested only
in real waveforms, but the simple generalization to complex waveforms is essential for Fourier 
analysis and for baseband modeling of communication channels. Since a real valued function 
can be viewed as a special case of a complex valued function, the results for complex functions 
are also useful for real functions. 


We observed earlier that more complicated analog sources such as video can be viewed as
mappings from $\mathbb{R}^n$ to $\mathbb{R}$, e.g., as mappings from horizontal/vertical position and time to real
analog values, but for simplicity we consider only waveform sources here.


¹The notation $\{u(t) : \mathbb{R} \to \mathbb{R}\}$ refers to a function that maps each real number $t \in \mathbb{R}$ into another real number
$u(t) \in \mathbb{R}$. Similarly, $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ maps each real number $t \in \mathbb{R}$ into a complex number $u(t) \in \mathbb{C}$. These
functions of time, i.e., these waveforms, are usually viewed as dimensionless, thus allowing us to separate physical
scale factors in communication problems from the waveform shape.


Recall why it is desirable to convert analog sources into bits:


• The use of a standard binary interface separates the problem of compressing sources from
the problems of channel coding and modulation.


• The outputs from multiple sources can be easily multiplexed together. Multiplexers can
work by interleaving bits, 8-bit bytes, or longer packets from different sources.


• When a bit sequence travels serially through multiple links (as in a network), the noisy bit
sequence can be cleaned up (regenerated) at each intermediate node, whereas noise tends
to gradually accumulate with noisy analog transmission.


A common way of encoding a waveform into a bit sequence is as follows: 


1. Approximate the analog waveform $\{u(t); t \in \mathbb{R}\}$ by its samples² $\{u(mT); m \in \mathbb{Z}\}$ at regularly
spaced sample times, $\ldots, -T, 0, T, 2T, \ldots$.


2. Quantize each sample (or n-tuple of samples) into a quantization region. 


3. Encode each quantization region (or block of regions) into a string of bits. 


These three layers of encoding are illustrated in Figure 4.1, with the three corresponding layers 
of decoding. 


[Figure: block diagram. Encoding: input waveform → sampler → (analog sequence) → quantizer → (symbol sequence) → discrete encoder → reliable binary channel. Decoding: reliable binary channel → discrete decoder → table lookup → analog filter → output waveform.]

Figure 4.1: Encoding and decoding a waveform source. 
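The three encoding layers listed above, together with the matching decoder, can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any real system: the function names, the uniform quantizer, the fixed-length binary code standing in for the discrete encoder, and the 440 Hz test tone are all assumptions, and the analog filter that converts the decoded samples back into a waveform is omitted.

```python
import math

def sample(u, T, m_range):
    """Step 1: samples u(mT) for the integers m in m_range."""
    return [u(m * T) for m in m_range]

def quantize(x, lo, hi, bits):
    """Step 2: index of the uniform quantization region containing x (clipped)."""
    levels = 2 ** bits
    width = (hi - lo) / levels
    index = math.floor((x - lo) / width)
    return min(max(index, 0), levels - 1)

def encode(index, bits):
    """Step 3: fixed-length binary string for a region index."""
    return format(index, f"0{bits}b")

def decode(codeword, lo, hi, bits):
    """Decoder: bits -> region index -> region midpoint (table lookup)."""
    width = (hi - lo) / 2 ** bits
    return lo + (int(codeword, 2) + 0.5) * width

T, BITS, LO, HI = 1 / 8000, 8, -1.0, 1.0

def waveform(t):
    return math.sin(2 * math.pi * 440 * t)  # hypothetical test tone

samples = sample(waveform, T, range(8))
codewords = [encode(quantize(x, LO, HI, BITS), BITS) for x in samples]
reconstructed = [decode(c, LO, HI, BITS) for c in codewords]

for x, c, y in zip(samples, codewords, reconstructed):
    print(f"{x:+.4f} -> {c} -> {y:+.4f}")
```

With a uniform quantizer and midpoint reconstruction, each decoded sample differs from the original by at most half a quantization interval, provided the sample lies in [LO, HI].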


Example 4.1.1. In standard telephony, the voice is filtered to 4000 Hertz (4 kHz) and then
sampled at 8000 samples per second.³ Each sample is then quantized to one of 256 possible
levels, represented by 8 bits. Thus the voice signal is represented as a 64 kilobit/second (kb/s)
sequence. (Modern digital wireless systems use more sophisticated voice coding schemes that
reduce the data rate to about 8 kb/s with little loss of voice quality.)
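The arithmetic behind the 64 kb/s figure is simply the sample rate times the number of bits per sample. A small Python sketch, with a hypothetical helper name, makes the dependence on the two parameters explicit:

```python
import math

def pcm_bit_rate(sample_rate_hz, levels):
    """Bits per second when each sample is coded with a fixed-length codeword."""
    bits_per_sample = math.ceil(math.log2(levels))
    return sample_rate_hz * bits_per_sample

rate = pcm_bit_rate(8000, 256)
print(rate)          # 64000 bits per second
print(rate / 8000)   # ratio to the roughly 8 kb/s quoted for modern voice coders
```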


The sampling above may be generalized in a variety of ways for converting waveforms into
sequences of real or complex numbers. For example, modern voice compression techniques first
segment the voice waveform into 20 msec segments and then use the frequency structure of
each segment to generate a vector of numbers. The resulting vector can then be quantized and
encoded as discussed before.


²$\mathbb{Z}$ denotes the set of integers, $-\infty < m < \infty$, so $\{u(mT); m \in \mathbb{Z}\}$ denotes the doubly infinite sequence of
samples with $-\infty < m < \infty$.

³The sampling theorem, to be discussed in Section 4.6, essentially says that if a waveform is baseband-limited
to W Hz, then it can be represented perfectly by 2W samples per second. The highest note on a piano is about 4
kHz, which is considerably higher than most voice frequencies.


An individual waveform from an analog source should be viewed as a sample waveform from a 
random process. The resulting probabilistic structure on these sample waveforms then deter- 
mines a probability assignment on the sequences representing these sample waveforms. This 
random characterization will be studied in Chapter 7; for now, the focus is on ways to map de- 
terministic waveforms to sequences and vice versa. These mappings are crucial both for source 
coding and channel transmission. 


4.1.2 Communication channels 


Some examples of communication channels are as follows: a pair of antennas separated by open 
space; a laser and an optical receiver separated by an optical fiber; and a microwave transmitter 
and receiver separated by a wave guide. For the antenna example, a real waveform at the 
input in the appropriate frequency band is converted by the input antenna into electromagnetic 
radiation, part of which is received at the receiving antenna and converted back to a waveform. 
For many purposes, these physical channels can be viewed as black boxes where the output 
waveform can be described as a function of the input waveform and noise of various kinds. 


Viewing these channels as black boxes is another example of layering. The optical or microwave 
devices or antennas can be considered as an inner layer around the actual physical channel. 
This layered view will be adopted here for the most part, since the physics of antennas, optics, 
and microwave are largely separable from the digital communication issues developed here. One 
exception to this is the description of physical channels for wireless communication in Chapter 
9. As will be seen, describing a wireless channel as a black box requires some understanding of 
the underlying physical phenomena. 


The function of a channel encoder, i.e., a modulator, is to convert the incoming sequence of
binary digits into a waveform in such a way that the noise corrupted waveform at the receiver 
can, with high probability, be converted back into the original binary digits. This is typically 
done by first converting the binary sequence into a sequence of analog signals, which are then 
converted to a waveform. This procession - bit sequence to analog sequence to waveform - is the 
same procession as performed by a source decoder, and the opposite to that performed by the 
source encoder. How these functions should be accomplished is very different in the source and 
channel cases, but both involve converting between waveforms and analog sequences. 


The waveforms of interest for channel transmission and reception should be viewed as sample 
waveforms of random processes (in the same way that source waveforms should be viewed as 
sample waveforms from a random process). This chapter, however, is concerned only about the 
relationship between deterministic waveforms and analog sequences; the necessary results about 
random processes will be postponed until Chapter 7. The reason why so much mathematical 
precision is necessary here, however, is that these waveforms are a priori unknown. In other 
words, one cannot use the conventional engineering approach of performing some computation 
on a function and assuming it is correct if an answer emerges.⁴


⁴This is not to disparage the use of computational (either hand or computer) techniques to get a quick answer
without worrying about fine points. Such techniques often provide insight and understanding, and the fine points
can be addressed later. For a random process, however, one doesn't know a priori which sample functions can
provide computational insight.
4.2 Fourier series


Perhaps the simplest example of an analog sequence that can represent a waveform comes from 
the Fourier series. The Fourier series is also useful in understanding Fourier transforms and 
discrete-time Fourier transforms (DTFTs). As will be explained later, our study of these topics 
will be limited to finite-energy waveforms. Useful models for source and channel waveforms 
almost invariably fall into the finite-energy class. 


The Fourier series represents a waveform, either periodic or time-limited, as a weighted sum of 
sinusoids. Each weight (coefficient) in the sum is determined by the function, and the function 
is essentially determined by the sequence of weights. Thus the function and the sequence of 
weights are essentially equivalent representations. 


Our interest here is almost exclusively in time-limited rather than periodic waveforms.⁵ Initially
the waveforms are assumed to be time-limited to some interval $-T/2 \le t \le T/2$ of an arbitrary
duration $T > 0$ around 0. This is then generalized to time-limited waveforms centered at some
arbitrary time. Finally, an arbitrary waveform is segmented into equal length segments each of
duration $T$; each such segment is then represented by a Fourier series. This is closely related
to modern voice-compression techniques where voice waveforms are segmented into 20 msec
intervals, each of which is separately expanded into a Fourier-like series.

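Consider a complex function $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ that is nonzero only for $-T/2 \le t \le T/2$ (i.e.,
$u(t) = 0$ for $t < -T/2$ and $t > T/2$). Such a function is frequently indicated by $\{u(t) :
[-T/2, T/2] \to \mathbb{C}\}$. The Fourier series for such a time-limited function is given by⁶

$$u(t) = \begin{cases} \sum_{k=-\infty}^{\infty} \hat u_k\, e^{2\pi i k t/T} & \text{for } -T/2 \le t \le T/2 \\ 0 & \text{elsewhere,} \end{cases} \tag{4.1}$$

where $i$ denotes⁷ $\sqrt{-1}$. The Fourier series coefficients $\hat u_k$ are in general complex (even if $u(t)$ is
real), and are given by

$$\hat u_k = \frac{1}{T} \int_{-T/2}^{T/2} u(t)\, e^{-2\pi i k t/T}\,dt, \qquad -\infty < k < \infty. \tag{4.2}$$

The standard rectangular function,

$$\mathrm{rect}(t) = \begin{cases} 1 & \text{for } -1/2 \le t \le 1/2 \\ 0 & \text{elsewhere,} \end{cases}$$

can be used to simplify (4.1) as follows:

$$u(t) = \sum_{k=-\infty}^{\infty} \hat u_k\, e^{2\pi i k t/T}\, \mathrm{rect}\!\left(\frac{t}{T}\right). \tag{4.3}$$

This expresses $u(t)$ as a linear combination of truncated complex sinusoids,

$$u(t) = \sum_{k \in \mathbb{Z}} \hat u_k\, \theta_k(t) \qquad \text{where } \theta_k(t) = e^{2\pi i k t/T}\, \mathrm{rect}\!\left(\frac{t}{T}\right). \tag{4.4}$$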


⁵Periodic waveforms are not very interesting for carrying information; after observing one period, the rest of
the waveform carries nothing new.

⁶The conditions and the sense in which (4.1) holds are discussed later.

⁷The use of $i$ for $\sqrt{-1}$ is standard in all scientific fields except electrical engineering. Electrical engineers
formerly reserved the symbol $i$ for electrical current and thus often use $j$ to denote $\sqrt{-1}$.

Assuming that (4.4) holds for some set of coefficients $\{\hat u_k; k \in \mathbb{Z}\}$, the following simple and
instructive argument shows why (4.2) is satisfied for that set of coefficients. Two complex
waveforms, $\theta_k(t)$ and $\theta_m(t)$, are defined to be orthogonal if $\int_{-\infty}^{\infty} \theta_k(t)\theta_m^*(t)\,dt = 0$. The truncated
complex sinusoids in (4.4) are orthogonal since the interval $[-T/2, T/2]$ contains an integral
number of cycles of each, i.e., for $k \ne m \in \mathbb{Z}$,

$$\int_{-\infty}^{\infty} \theta_k(t)\theta_m^*(t)\,dt = \int_{-T/2}^{T/2} e^{2\pi i (k-m) t/T}\,dt = 0.$$

Thus the right side of (4.2) can be evaluated as

$$\begin{aligned}
\frac{1}{T} \int_{-T/2}^{T/2} u(t)\, e^{-2\pi i k t/T}\,dt
  &= \frac{1}{T} \int_{-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \hat u_m\, \theta_m(t)\theta_k^*(t)\,dt \\
  &= \frac{\hat u_k}{T} \int_{-\infty}^{\infty} |\theta_k(t)|^2\,dt \\
  &= \frac{\hat u_k}{T} \int_{-T/2}^{T/2} dt = \hat u_k.
\end{aligned} \tag{4.5}$$


An expansion such as that of (4.4) is called an orthogonal expansion. As shown later, the 
argument in (4.5) can be used to find the coefficients in any orthogonal expansion. At that 
point, more care will be taken in exchanging the order of integration and summation above. 
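For reference, the orthogonality integral used in this argument can be evaluated directly. The short derivation below is a sketch that uses only the definition of the truncated sinusoids in (4.4); the shorthand $n = k - m$ is introduced here for convenience and does not appear in the text.

```latex
\begin{aligned}
\int_{-T/2}^{T/2} e^{2\pi i n t/T}\,dt
  &= \left[\frac{T}{2\pi i n}\, e^{2\pi i n t/T}\right]_{-T/2}^{T/2}
   = \frac{T}{2\pi i n}\left(e^{\pi i n} - e^{-\pi i n}\right)
   = \frac{T \sin(\pi n)}{\pi n} = 0,
   && n = k - m \neq 0, \\
\int_{-T/2}^{T/2} e^{2\pi i \cdot 0 \cdot t/T}\,dt
  &= \int_{-T/2}^{T/2} dt = T,
   && n = 0.
\end{aligned}
```

The first line is zero because $\sin(\pi n) = 0$ for every nonzero integer $n$, which is the precise sense in which the interval contains an integral number of cycles. The second line is the value $T$ that cancels the factor $1/T$ in (4.5).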


Example 4.2.1. This and the following example illustrate why (4.4) need not be valid for all
values of $t$. Let $u(t) = \mathrm{rect}(2t)$ (see Figure 4.2). Consider representing $u(t)$ by a Fourier series
over the interval $-1/2 \le t \le 1/2$. As illustrated, the series can be shown to converge to $u(t)$ at
all $t \in [-1/2, 1/2]$ except for the discontinuities at $t = \pm 1/4$. At $t = \pm 1/4$, the series converges
to the midpoint of the discontinuity and (4.4) is not valid⁸ at $t = \pm 1/4$. The next section will
show how to state (4.4) precisely so as to avoid these convergence issues.


[Figure 4.2 panels, left to right, each plotted for $-1/2 \le t \le 1/2$: $u(t) = \mathrm{rect}(2t)$; the partial sum $\frac{1}{2} + \frac{2}{\pi}\cos(2\pi t)$; the partial sum $\frac{1}{2} + \frac{2}{\pi}\cos(2\pi t) - \frac{2}{3\pi}\cos(6\pi t)$; the full series $\sum_{k=-\infty}^{\infty} \hat u_k\, e^{2\pi i k t}\, \mathrm{rect}(t)$.]


Figure 4.2: The Fourier series (over $[-1/2, 1/2]$) of a rectangular pulse. The second
figure depicts a partial sum with $k = -1, 0, 1$ and the third figure depicts a partial sum
with $-3 \le k \le 3$. The right figure illustrates that the series converges to $u(t)$ except
at the points $t = \pm 1/4$, where it converges to 1/2.
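The coefficients and partial sums in this example can be checked numerically. The Python sketch below estimates the integral in (4.2) with a midpoint rule for $u(t) = \mathrm{rect}(2t)$ and $T = 1$, then evaluates a partial sum of (4.1) at the discontinuity and at the origin. The function names, the grid size, and the truncation point K = 25 are arbitrary choices made for illustration.

```python
import cmath
import math

def fourier_coefficient(u, k, T=1.0, n=4000):
    """Midpoint-rule estimate of (4.2): (1/T) * integral of u(t) e^{-2 pi i k t/T} dt."""
    step = T / n
    total = 0.0 + 0.0j
    for j in range(n):
        t = -T / 2 + (j + 0.5) * step
        total += u(t) * cmath.exp(-2j * math.pi * k * t / T) * step
    return total / T

def rect(t):
    return 1.0 if -0.5 <= t <= 0.5 else 0.0

def u(t):
    return rect(2 * t)  # the pulse of Example 4.2.1

def partial_sum(t, K, T=1.0):
    """Sum of u_hat_k e^{2 pi i k t/T} over -K <= k <= K."""
    return sum(fourier_coefficient(u, k, T) * cmath.exp(2j * math.pi * k * t / T)
               for k in range(-K, K + 1))

for k in range(4):
    print(k, round(fourier_coefficient(u, k).real, 4))
print("partial sum at t = 1/4, K = 25:", round(partial_sum(0.25, 25).real, 4))
print("partial sum at t = 0,   K = 25:", round(partial_sum(0.0, 25).real, 4))
```

The computed coefficients agree with the amplitudes shown in the figure panels: 1/2 for k = 0, 1/π for k = ±1 (which combine into the 2/π cosine term), zero for even k ≠ 0, and −1/(3π) for k = ±3. The partial sum at t = 1/4 stays at the midpoint value 1/2 for every K, while the sum at t = 0 approaches 1 only gradually.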


Example 4.2.2. As a variation of the previous example, let $v(t)$ be 1 for $0 \le t \le 1/2$ and 0
elsewhere. Figure 4.3 shows the corresponding Fourier series over the interval $-1/2 \le t \le 1/2$.
A peculiar feature of this example is the isolated discontinuity at $t = -1/2$, where the series
converges to 1/2. This happens because the untruncated Fourier series, $\sum_{k=-\infty}^{\infty} \hat v_k\, e^{2\pi i k t}$, is
periodic with period 1 and thus must have the same value at both $t = -1/2$ and $t = 1/2$. More
generally, if an arbitrary function $\{v(t) : [-T/2, T/2] \to \mathbb{C}\}$ has $v(-T/2) \ne v(T/2)$, then its
Fourier series over that interval cannot converge to $v(t)$ at both those points.


⁸Most engineers, including the author, would say 'so what, who cares what the Fourier series converges to
at a discontinuity of the waveform'. Unfortunately, this example is only the tip of an iceberg, especially when
time-sampling of waveforms and sample waveforms of random processes are considered.


[Figure 4.3 panels, left to right, each plotted for $-1/2 \le t \le 1/2$: $v(t) = \mathrm{rect}(2t - \frac{1}{2})$; the partial sum $\frac{1}{2} + \frac{2}{\pi}\sin(2\pi t)$; the full series $\sum_{k=-\infty}^{\infty} \hat v_k\, e^{2\pi i k t}\, \mathrm{rect}(t)$.]

Figure 4.3: The Fourier series over $[-1/2, 1/2]$ of the same rectangular pulse shifted
right by 1/4. The middle figure again depicts a partial expansion with $k = -1, 0, 1$.
The right figure shows that the series converges to $v(t)$ except at the points $t = -1/2$, $0$,
and $1/2$, at each of which it converges to 1/2.


4.2.1 Finite-energy waveforms 

The energy in a real or complex waveform u(t) is defined⁹ to be ∫_{−∞}^{∞} |u(t)|² dt. The energy in 
source waveforms plays a major role in determining how well the waveforms can be compressed 
for a given level of distortion. As a preliminary explanation, consider the energy in a time-limited 
waveform {u(t) : [−T/2, T/2] → R}. This energy is related to the Fourier series coefficients of 
u(t) by the following energy equation, which is derived in Exercise 4.2 by the same argument 
used in (4.5): 

$$\int_{-T/2}^{T/2} |u(t)|^2 \, dt = T \sum_{k=-\infty}^{\infty} |\hat{u}_k|^2. \qquad (4.6)$$

Suppose that u(t) is compressed by first generating its Fourier series coefficients, {û_k; k ∈ Z}, and 
then compressing those coefficients. Let {v̂_k; k ∈ Z} be this sequence of compressed coefficients. 
Using a squared distortion measure for the coefficients, the overall distortion is Σ_k |û_k − v̂_k|². 
Suppose these compressed coefficients are now encoded, sent through a channel, reliably decoded, 
and converted back to a waveform v(t) = Σ_k v̂_k e^{2πikt/T} as in Figure 4.1. The difference between 
the input waveform u(t) and the output v(t) is then u(t) − v(t), which has the Fourier series 
Σ_k (û_k − v̂_k) e^{2πikt/T}. Substituting u(t) − v(t) into (4.6) results in the difference-energy equation, 

$$\int_{-T/2}^{T/2} |u(t) - v(t)|^2 \, dt = T \sum_{k} |\hat{u}_k - \hat{v}_k|^2. \qquad (4.7)$$


Thus the energy in the difference between u(t) and its reconstruction v(t) is simply T times 
the sum of the squared differences of the quantized coefficients. This means that reducing the 
squared difference in the quantization of a coefficient leads directly to reducing the energy in 
the waveform difference. The energy in the waveform difference is a common and reasonable 
measure of distortion, but the fact that it is directly related to mean-squared coefficient distortion 
provides an important added reason for its widespread use. 

⁹Note that u² = |u|² if u is real, but for complex u, u² can be negative or complex and |u|² = uu* = 
[ℜ(u)]² + [ℑ(u)]² is required to correspond to the intuitive notion of energy. 
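
As a numerical sanity check on (4.6) and (4.7), the following Python sketch builds a waveform from a small, made-up set of Fourier coefficients, quantizes them, and compares both sides of each equation. The coefficient values, the quantizer step, the choice T = 2, and the helper names (synthesize, energy, quantize) are illustrative assumptions and do not come from the text. The integrals use the midpoint rule, which on equally spaced points is exact up to rounding for the trigonometric polynomials involved here.

```python
import cmath

T = 2.0   # interval length (illustrative choice)
K = 3     # keep coefficients for -K <= k <= K

def synthesize(coeffs, t):
    """Evaluate sum_k c_k exp(2*pi*i*k*t/T) for a dict {k: c_k}."""
    return sum(c * cmath.exp(2j * cmath.pi * k * t / T) for k, c in coeffs.items())

def energy(f, n=512):
    """Midpoint-rule value of the integral of |f(t)|^2 over [-T/2, T/2]."""
    dt = T / n
    return sum(abs(f(-T / 2 + (i + 0.5) * dt)) ** 2 for i in range(n)) * dt

def quantize(c, step=0.25):
    """Round real and imaginary parts to the nearest multiple of step."""
    return complex(round(c.real / step) * step, round(c.imag / step) * step)

# Hypothetical coefficients; any finite set would serve.
u_hat = {k: complex(1.0 / (1 + k * k), 0.3 * k) for k in range(-K, K + 1)}
v_hat = {k: quantize(c) for k, c in u_hat.items()}

# Energy equation (4.6): integral of |u|^2 versus T * sum |u_hat_k|^2.
lhs_46 = energy(lambda t: synthesize(u_hat, t))
rhs_46 = T * sum(abs(c) ** 2 for c in u_hat.values())

# Difference-energy equation (4.7) for the quantized reconstruction v(t).
lhs_47 = energy(lambda t: synthesize(u_hat, t) - synthesize(v_hat, t))
rhs_47 = T * sum(abs(u_hat[k] - v_hat[k]) ** 2 for k in u_hat)

print(lhs_46, rhs_46)
print(lhs_47, rhs_47)
```

Each printed pair should agree to within floating-point rounding; shrinking the quantizer step shrinks both members of the second pair together.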


There must be at least T units of delay involved in finding the Fourier coefficients for u(t) in 
[−T/2, T/2] and then reconstituting v(t) from the quantized coefficients at the receiver. There 
is additional processing and propagation delay in the channel. Thus the output waveform must 
be a delayed approximation to the input. All of this delay is accounted for by timing recovery 
processes at the receiver. This timing delay is set so that v(t) at the receiver, according to the 
receiver timing, is the appropriate approximation to u(t) at the transmitter, according to the 
transmitter timing. Timing recovery and delay are important problems, but they are largely 
separable from the problems of current interest. Thus, after recognizing that receiver timing is 
delayed from transmitter timing, delay can be otherwise ignored for now. 


Next, visualize the Fourier coefficients û_k as sample values of independent random variables and 
visualize u(t), as given by (4.3), as a sample value of the corresponding random process (this will 
be explained carefully in Chapter 7). The expected energy in this random process is equal to T 
times the sum of the mean-squared values of the coefficients. Similarly the expected energy in 
the difference between u(t) and v(t) is equal to T times the sum of the mean-squared coefficient 
distortions. It was seen by scaling in Chapter 3 that the mean-squared quantization error 
for an analog random variable is proportional to the variance of that random variable. It is thus 
not surprising that the expected energy in a random waveform will have a similar relation to 
the mean-squared distortion after compression. 


There is an obvious practical problem with compressing a finite-duration waveform by quantizing 
an infinite set of coefficients. One solution is equally obvious: compress only those coefficients 
with a significant mean-squared value. Since the expected value of Σ_k |û_k|² is finite for 
finite-energy functions, the mean-squared distortion from ignoring small coefficients can be made as 
small as desired by choosing a sufficiently large finite set of coefficients. One then simply chooses 
v̂_k = 0 in (4.7) for each ignored value of k. 


The above argument will be developed carefully after developing the required tools. For now, 
there are two important insights. First, the energy in a source waveform is an important param- 
eter in data compression, and second, the source waveforms of interest will have finite energy 
and can be compressed by compressing a finite number of coefficients. 


Next consider the waveforms used for channel transmission. The energy used over any finite 
interval T is limited both by regulatory agencies and by physical constraints on transmitters and 
antennas. One could consider waveforms of finite power but infinite duration and energy (such 
as the lowly sinusoid). On one hand, physical waveforms do not last forever (transmitters wear 
out or become obsolete), but on the other hand, models of physical waveforms can have infinite 
duration, modeling physical lifetimes that are much longer than any time scale of communication 
interest. Nonetheless, for reasons that will gradually unfold, the channel waveforms in this text 
will almost always be restricted to finite energy. 


There is another important reason for concentrating on finite-energy waveforms. Not only are 
they the appropriate models for source and channel waveforms, but they also have remarkably 
simple and general properties. These properties rely on an additional constraint called 
measurability, which is explained in the following section. These finite-energy measurable functions 
are called ℒ₂ functions. When time-constrained, they always have Fourier series, and without 
a time constraint, they always have Fourier transforms. Perhaps the most important property, 
however, is that ℒ₂ functions can be treated essentially as conventional vectors (see Chapter 5). 


One might question whether a limitation to finite-energy functions is too constraining. For 
example, a sinusoid is often used to model the carrier in passband communication, and sinusoids 
have infinite energy because of their infinite duration. As seen later, however, when a 
finite-energy baseband waveform is modulated by that sinusoid up to passband, the resulting passband 
waveform has finite energy. 


As another example, the unit impulse (the Dirac delta function δ(t)) is a generalized function 
used to model waveforms of unit area that are nonzero only in a narrow region around t = 0, 
narrow relative to all other intervals of interest. The impulse response of a linear-time-invariant 
filter is, of course, the response to a unit impulse; this response approximates the response to 
a physical waveform that is sufficiently narrow and has unit area. The energy in that physical 
waveform, however, grows wildly as the waveform becomes more narrow. A rectangular pulse 
of width ε and height 1/ε, for example, has unit area for all ε > 0 but has energy 1/ε, which 
approaches ∞ as ε → 0. One could view the energy in a unit impulse as being either undefined 
or infinite, but in no way could view it as being finite. 


To summarize, there are many useful waveforms outside the finite-energy class. Although they 
are not physical waveforms, they are useful models of physical waveforms where energy is not 
important. Energy is such an important aspect of source and channel waveforms, however, that 
such waveforms can safely be limited to the finite-energy class. 


4.3 ℒ₂ functions and Lebesgue integration over [−T/2, T/2] 


A function {u(t) : R → C} is defined to be ℒ₂ if it is Lebesgue measurable and has a finite 
Lebesgue integral ∫_{−∞}^{∞} |u(t)|² dt. This section provides a basic and intuitive understanding of 
what these terms mean. The appendix provides proofs of the results, additional examples, and 
more depth of understanding. Still deeper understanding requires a good mathematics course 
in real and complex variables. The appendix is not required for basic engineering understanding 
of results in this and subsequent chapters, but it will provide deeper insight. 


The basic idea of Lebesgue integration is no more complicated than the more common 
Riemann integration taught in freshman college courses. Whenever the Riemann integral exists, 
the Lebesgue integral also exists¹⁰ and has the same value. Thus all the familiar ways of 
calculating integrals, including tables and numerical procedures, hold without change. The Lebesgue 
integral is more useful here, partly because it applies to a wider set of functions, but, more 
importantly, because it greatly simplifies the main results. 


This section considers only time-limited functions, {u(t) : [−T/2, T/2] → C}. These are the 
functions of interest for Fourier series, and the restriction to a finite interval avoids some math- 
ematical details better addressed later. 


Figure 4.4 shows intuitively how Lebesgue and Riemann integration differ. Conventional 
Riemann integration of a nonnegative real-valued function u(t) over an interval [−T/2, T/2] is 
conceptually performed in Figure 4.4(a) by partitioning [−T/2, T/2] into, say, i₀ intervals each 
of width T/i₀. The function is then approximated within the ith such interval by a single 
value u_i, such as the mid-point of values in the interval. The integral is then approximated as 
Σ_{i=1}^{i₀} (T/i₀) u_i. If the function is sufficiently smooth, then this approximation has a limit, called 
the Riemann integral, as i₀ → ∞. 


¹⁰There is a slight notional qualification to this which is discussed in the sinc function example of Section 4.5.1. 

[Figure 4.4 panels: (a) Riemann, with sample values u_i on equal-width intervals over [−T/2, T/2] and ∫ u(t) dt ≈ Σ_{i=1}^{i₀} (T/i₀) u_i; (b) Lebesgue, with levels δ, 2δ, 3δ and crossing times t₁, …, t₄, where μ₂ = (t₂ − t₁) + (t₄ − t₃), μ₁ = (t₁ + T/2) + (T/2 − t₄), μ₀ = 0, and ∫ u(t) dt ≈ Σ_m mδ μ_m.]

Figure 4.4: Example of Riemann and Lebesgue integration 


To integrate the same function by Lebesgue integration, the vertical axis is partitioned into 
intervals each of height δ, as shown in Figure 4.4(b). For the mth such interval,¹¹ [mδ, (m+1)δ), 
let E_m be the set of values of t such that mδ ≤ u(t) < (m+1)δ. For example, the set E₂ is 
illustrated by arrows in Figure 4.4 and is given by 

$$E_2 = \{t : 2\delta \le u(t) < 3\delta\} = [t_1, t_2) \cup (t_3, t_4].$$

As explained below, if E_m is a finite union of separated¹² intervals, its measure, μ_m, is the sum 
of the widths of those intervals; thus μ₂ in the example above is given by 

$$\mu_2 = \mu(E_2) = (t_2 - t_1) + (t_4 - t_3). \qquad (4.8)$$

Similarly, E₁ = [−T/2, t₁) ∪ (t₄, T/2] and μ₁ = (t₁ + T/2) + (T/2 − t₄). 
The Lebesgue integral is approximated as Σ_m (mδ) μ_m. This approximation is indicated by the 
vertically shaded area in the figure. The Lebesgue integral is essentially the limit as δ → 0. 


In short, the Riemann approximation to the area under a curve splits the horizontal axis into 
uniform segments and sums the corresponding rectangular areas. The Lebesgue approximation 
splits the vertical axis into uniform segments and sums the height times width measure for each 
segment. In both cases, a limiting operation is required to find the integral, and Section 4.3.3 
gives an example where the limit exists in the Lebesgue but not the Riemann case. 
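
The two approximations can be compared numerically. The sketch below is only an illustration under stated assumptions: it uses a triangular pulse (not a function from the text) because the measure of each set {t : u(t) ≥ a} is then known in closed form, and the names riemann, lebesgue and measure_at_least are hypothetical helpers.

```python
T = 1.0

def u(t):
    """Triangular pulse: 1 at t = 0, falling linearly to 0 at t = +/- T/2."""
    return 1.0 - 2.0 * abs(t) / T

def riemann(n):
    """Split the horizontal axis into n equal intervals; use midpoint values."""
    width = T / n
    return sum(u(-T / 2 + (i + 0.5) * width) for i in range(n)) * width

def measure_at_least(a):
    """Measure of {t : u(t) >= a}, in closed form for this particular pulse."""
    return T * min(1.0, max(0.0, 1.0 - a))

def lebesgue(delta):
    """Split the vertical axis into slices of height delta; sum (m*delta)*mu_m."""
    total, m = 0.0, 0
    while m * delta <= 1.0:          # u(t) never exceeds 1
        # mu_m = measure of {t : m*delta <= u(t) < (m+1)*delta}
        mu_m = measure_at_least(m * delta) - measure_at_least((m + 1) * delta)
        total += m * delta * mu_m
        m += 1
    return total

for n, delta in [(4, 0.25), (16, 0.0625), (256, 2.0 ** -8)]:
    print(n, riemann(n), delta, lebesgue(delta))
```

For this pulse the exact integral is T/2. The Lebesgue approximation equals (T/2)(1 − δ), so it approaches T/2 from below as δ shrinks, consistent with the slices underestimating u(t) by less than δ.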


4.3.1 Lebesgue measure for a union of intervals 


In order to explain Lebesgue integration further, measure must be defined for a more general 
class of sets. 


The measure of an interval I from a to b, a ≤ b, is defined to be μ(I) = b − a ≥ 0. For any finite 
union of, say, ℓ separated intervals, E = ∪_{j=1}^{ℓ} I_j, the measure μ(E) is defined as 

$$\mu(E) = \sum_{j=1}^{\ell} \mu(I_j). \qquad (4.9)$$

¹¹The notation [a, b) denotes the semiclosed interval a ≤ t < b. Similarly, (a, b] denotes the semiclosed interval 
a < t ≤ b, (a, b) the open interval a < t < b, and [a, b] the closed interval a ≤ t ≤ b. In the special case where 
a = b, the interval [a, a] consists of the single point a, whereas [a, a), (a, a], and (a, a) are empty. 

¹²Two intervals are separated if they are both nonempty and there is at least one point between them that lies 
in neither interval; e.g., (0,1) and (1,2) are separated. In contrast, two sets are disjoint if they have no points in 
common. Thus (0,1) and [1,2] are disjoint but not separated. 

This definition of μ(E) was used in (4.8) and is necessary for the approximation in Figure 4.4(b) to 
correspond to the area under the approximating curve. The fact that the measure of an interval 
does not depend on inclusion of the end points corresponds to the basic notion of area under a 
curve. Finally, since these separated intervals are all contained in [−T/2, T/2], it is seen that 
the sum of their widths is at most T, i.e., 

$$0 \le \mu(E) \le T. \qquad (4.10)$$


Any finite union of, say, ℓ arbitrary intervals, E = ∪_{j=1}^{ℓ} I_j, can also be uniquely expressed as 
a finite union of at most ℓ separated intervals, say I′_1, …, I′_k, k ≤ ℓ (see Exercise 4.5), and its 
measure is then given by 

$$\mu(E) = \sum_{j=1}^{k} \mu(I'_j). \qquad (4.11)$$
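
A minimal sketch of this computation for a finite union follows. It rests on a simplifying assumption that is not in the text: every interval is given as a pair (a, b) with a ≤ b and treated as closed, so intervals that merely touch are merged. Since end points do not affect measure, this changes the set by at most finitely many points and leaves the measure unchanged. The function names are hypothetical.

```python
def separated_form(intervals):
    """Merge overlapping or touching (a, b) pairs into sorted, non-touching ones."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def measure(intervals):
    """Sum of the widths of the merged intervals, as in (4.11)."""
    return sum(b - a for a, b in separated_form(intervals))

example = [(0, 1), (0.5, 2), (3, 4)]
print(separated_form(example), measure(example))
```

Simply adding the three widths would give 3.5; merging first avoids counting the overlap between (0, 1) and (0.5, 2) twice.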


The union of a countably infinite collection¹³ of separated intervals, say B = ∪_{j=1}^{∞} I_j, is also 
defined to be measurable and has a measure given by 

$$\mu(B) = \lim_{\ell \to \infty} \sum_{j=1}^{\ell} \mu(I_j). \qquad (4.12)$$

The summation on the right is bounded between 0 and T for each ℓ. Since μ(I_j) ≥ 0, the sum is 
nondecreasing in ℓ. Thus the limit exists and lies between 0 and T. Also the limit is independent 
of the ordering of the I_j (see Exercise 4.4). 


Example 4.3.1. Let I_j = (T2^{−2j}, T2^{−2j+1}) for all integers j ≥ 1. The jth interval then has 
measure μ(I_j) = T2^{−2j}. These intervals get smaller and closer to 0 as j increases. They are 
easily seen to be separated. The union B = ∪_j I_j then has measure μ(B) = Σ_{j=1}^{∞} T2^{−2j} = T/3. 
Visualize replacing the function in Figure 4.4 by one that oscillates faster and faster as t → 0; 
B could then represent the set of points on the horizontal axis corresponding to a given vertical 
slice. 
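
The limit in this example can be checked with exact arithmetic. The short sketch below assumes T = 1 for concreteness (the function name partial_measure is hypothetical) and evaluates the partial sums of the interval widths T·2^(−2j), which form a geometric series with ratio 1/4.

```python
from fractions import Fraction

T = Fraction(1)   # illustrative choice of interval length

def partial_measure(l):
    """Sum of the widths T * 2**(-2j) for j = 1, ..., l."""
    return sum(T * Fraction(1, 4 ** j) for j in range(1, l + 1))

for l in (1, 2, 5, 10):
    gap = T / 3 - partial_measure(l)   # distance still to go to the limit T/3
    print(l, partial_measure(l), gap)
```

The gap to T/3 after l terms is (T/3)·4^(−l), so the partial sums are nondecreasing and bounded, as the general argument requires.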


Example 4.3.2. As a variation of the above example, suppose B = ∪_j I_j where I_j = 
[T2^{−2j}, T2^{−2j}] for each j. Then interval I_j consists of the single point T2^{−2j} so μ(I_j) = 0. 
In this case, Σ_{j=1}^{ℓ} μ(I_j) = 0 for each ℓ. The limit of this as ℓ → ∞ is also 0, so μ(B) = 0 in this 
case. By the same argument, the measure of any countably infinite set of points is 0. 


Any countably infinite union of arbitrary (perhaps intersecting) intervals can be uniquely¹⁴ 
represented as a countable (i.e., either a countably infinite or finite) union of separated intervals 
(see Exercise 4.6); its measure is defined by applying (4.12) to that representation. 


4.3.2 Measure for more general sets 


It might appear that the class of countable unions of intervals is broad enough to represent any 
set of interest, but it turns out to be too narrow to allow the general kinds of statements that 
formed our motivation for discussing Lebesgue integration. One vital generalization is to require 
that the complement B̄ (relative to [−T/2, T/2]) of any measurable set B also be measurable.¹⁵ 
Since μ([−T/2, T/2]) = T and every point of [−T/2, T/2] lies in either B or B̄ but not both, the 
measure of B̄ should be T − μ(B). The reason why this property is necessary in order for the 
Lebesgue integral to correspond to the area under a curve is illustrated in Figure 4.5. 

[Figure 4.5: sketch of f(t) over [−T/2, T/2], with the set B and its complement B̄ marked on the horizontal axis.]

Figure 4.5: Let f(t) have the value 1 on a set B and the value 0 elsewhere in [−T/2, T/2]. 
Then ∫ f(t) dt = μ(B). The complement B̄ of B is also illustrated and it is seen that 
1 − f(t) is 1 on the set B̄ and 0 elsewhere. Thus ∫ [1 − f(t)] dt = μ(B̄), which must 
equal T − μ(B) for integration to correspond to the area under a curve. 

¹³An elementary discussion of countability is given in Appendix 4A.1. Readers unfamiliar with ideas such as 
the countability of the rational numbers are strongly encouraged to read this appendix. 

¹⁴The collection of separated intervals and the limit in (4.12) is unique, but the ordering of the intervals is not. 


The subset inequality is another property that measure should have: this states that if A and 
B are both measurable and A ⊆ B, then μ(A) ≤ μ(B). One can also visualize from Figure 4.5 
why this subset inequality is necessary for integration to represent the area under a curve. 


Before defining which sets in [−T/2, T/2] are measurable and which are not, a measure-like 
function called outer measure is introduced that exists for all sets in [−T/2, T/2]. For an 
arbitrary set A, the set B is said to cover A if A ⊆ B and B is a countable union of intervals. The 
outer measure μ^o(A) is then essentially the measure of the smallest cover of A. In particular,¹⁶ 

$$\mu^o(A) = \inf_{B:\, B \text{ covers } A} \mu(B). \qquad (4.13)$$
Not surprisingly, the outer measure of a countable union of intervals is equal to its measure as 
already defined (see Appendix 4A.3). 


Measurable sets and measure over the interval [−T/2, T/2] can now be defined as follows: 


Definition: A set A (over [−T/2, T/2]) is measurable if μ^o(A) + μ^o(Ā) = T. If A is measurable, 
then its measure, μ(A), is the outer measure μ^o(A). 

Intuitively, then, a set is measurable if the set and its complement are sufficiently untangled 
that each can be covered by countable unions of intervals which have arbitrarily little overlap. 
The example at the end of Section 4A.4 constructs the simplest nonmeasurable set we are aware 
of; it should be noted how bizarre it is and how tangled it is with its complement. 


The definition of measurability is a ‘mathematician’s definition’ in the sense that it is very 
succinct and elegant, but doesn’t provide many immediate clues about determining whether a 
set is measurable and, if so, what its measure is. This is now briefly discussed. 

¹⁵Appendix 4A.1 uses the set of rationals in [−T/2, T/2] to illustrate that the complement B̄ of a countable 
union of intervals B need not be a countable union of intervals itself. In this case μ(B̄) = T − μ(B), which is 
shown to be valid also when B̄ is a countable union of intervals. 

¹⁶The infimum (inf) of a set of real numbers is essentially the minimum of that set. The difference between the 
minimum and the infimum can be seen in the example of the set of real numbers strictly greater than 1. This set 
has no minimum, since for each number in the set, there is a smaller number still greater than 1. To avoid this 
somewhat technical issue, the infimum is defined as the greatest lower bound of a set. In the example, all numbers 
less than or equal to 1 are lower bounds for the set, and 1 is then the greatest lower bound, i.e., the infimum. Every 
nonempty set of real numbers has an infimum if one includes −∞ as a choice. 


It is shown in Appendix 4A.3 that countable unions of intervals are measurable according to 
this definition, and the measure can be found by breaking the set into separated intervals. Also, 
by definition, the complement of every measurable set is also measurable, so the complements of 
countable unions of intervals are measurable. Next, if A ⊆ A′, then any cover of A′ also covers 
A so the subset inequality is satisfied. This often makes it possible to find the measure of a set 
by using a limiting process on a sequence of measurable sets contained in or containing a set of 
interest. Finally, the following theorem is proven in Section 4A.4 of the appendix. 


Theorem 4.3.1. Let A₁, A₂, …, be any sequence of measurable sets. Then S = ∪_{j=1}^{∞} A_j and 
D = ∩_{j=1}^{∞} A_j are measurable. If A₁, A₂, … are also disjoint, then μ(S) = Σ_j μ(A_j). If 
μ^o(A) = 0, then A is measurable and has zero measure. 


This theorem and definition say that the collection of measurable sets is closed under countable 
unions, countable intersections, and complementation. This partly explains why it is so hard to 
find nonmeasurable sets and also why their existence can usually be ignored - they simply don’t 
arise in the ordinary process of analysis. 


Another consequence concerns sets of zero measure. It was shown earlier that any set containing 
only countably many points has zero measure, but there are many other sets of zero measure. 
The Cantor set example in Section 4A.4 illustrates a set of zero measure with uncountably many 
elements. The theorem implies that a set A has zero measure if, for any ε > 0, A has a cover 
B such that μ(B) ≤ ε. The definition of measurability shows that the complement of any set of 
zero measure has measure T, i.e., [−T/2, T/2] is the cover of smallest measure. It will be seen 
shortly that for most purposes, including integration, sets of zero measure can be ignored and 
sets of measure T can be viewed as the entire interval [−T/2, T/2]. 
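
The ε-cover criterion can be made concrete for the rationals. One standard construction, sketched here under stated assumptions, places an interval of width ε·2^(−j) around the jth rational in some enumeration, so that the total width never exceeds ε however many rationals are listed. The enumeration order, the denominator bound max_den, and the function names are illustrative choices, not taken from the text.

```python
import math
from fractions import Fraction

def rationals(T, max_den):
    """Distinct rationals p/q in [-T/2, T/2] with 1 <= q <= max_den, in listing order."""
    half = Fraction(T) / 2
    seen, out = set(), []
    for q in range(1, max_den + 1):
        for p in range(math.floor(-half * q), math.ceil(half * q) + 1):
            r = Fraction(p, q)
            if -half <= r <= half and r not in seen:
                seen.add(r)
                out.append(r)
    return out

def cover_width(points, eps):
    """Total width when the j-th point (j = 1, 2, ...) gets an interval of width eps * 2**-j."""
    return sum(Fraction(eps) * Fraction(1, 2 ** j) for j in range(1, len(points) + 1))

eps = Fraction(1, 100)
for max_den in (5, 20, 50):
    pts = rationals(1, max_den)
    print(max_den, len(pts), float(cover_width(pts, eps)))
```

The covering intervals may overlap, so their total width is only an upper bound on the measure of the cover; that bound stays below ε as more rationals are included, which is all the criterion needs.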


This concludes our study of measurable sets on [−T/2, T/2]. The bottom line is that not 
all sets are measurable, but that non-measurable sets arise only from bizarre and artificial 
constructions and can usually be ignored. The definitions of measure and measurability might 
appear somewhat arbitrary, but in fact they arise simply through the natural requirement that 
intervals and countable unions of intervals be measurable with the given measure¹⁷ and that 
the subset inequality and complement property be satisfied. If we wanted additional sets to be 
measurable, then at least one of the above properties would have to be sacrificed and integration 
itself would become bizarre. The major result here, beyond basic familiarity and intuition, is 
Theorem 4.3.1 which is used repeatedly in the following sections. The appendix fills in many 
important details and proves the results here. 


4.3.3 Measurable functions and integration over [−T/2, T/2] 


A function {u(t) : [−T/2, T/2] → R} is said to be Lebesgue measurable (or more briefly 
measurable) if the set of points {t : u(t) < β} is measurable for each β ∈ R. If u(t) is measurable, 
then, as shown in Exercise 4.11, the sets {t : u(t) ≤ β}, {t : u(t) ≥ β}, {t : u(t) > β} and 
{t : α ≤ u(t) < β} are measurable for all α < β ∈ R. Thus, if a function is measurable, the 
measure μ_m = μ({t : mδ ≤ u(t) < (m+1)δ}) associated with the mth horizontal slice in Figure 
4.4 must exist for each δ > 0 and m. 

¹⁷We have not distinguished between the condition of being measurable and the actual measure assigned a set, 
which is natural for ordinary integration. The theory can be trivially generalized, however, to random variables 
restricted to [−T/2, T/2]. In this case, the measure of an interval is redefined to be the probability of that interval. 
Everything else remains the same except that some individual points might have non-zero probability. 


For the Lebesgue integral to exist, it is also necessary that the Figure 4.4 approximation to 
the Lebesgue integral has a limit as the vertical interval size δ goes to 0. Initially consider 
only nonnegative functions, u(t) ≥ 0 for all t. For each integer n ≥ 1, define the nth order 
approximation to the Lebesgue integral as that arising from partitioning the vertical axis into 
intervals each of height δ_n = 2^{−n}. Thus a unit increase in n corresponds to halving the vertical 
interval size as illustrated below. 


[Figure 4.6: sketch of u(t) over [−T/2, T/2]; image not reproduced.]


Figure 4.6: The improvement in the approximation to the Lebesgue integral by a unit 
increase in n is indicated by the horizontal crosshatching. 


Let μ_{m,n} be the measure of {t : m2^{−n} ≤ u(t) < (m+1)2^{−n}}, i.e., the measure of the set of 
t ∈ [−T/2, T/2] for which u(t) is in the mth vertical interval for the nth order approximation. 
The approximation Σ_m m2^{−n} μ_{m,n} might be infinite¹⁸ for all n, and in this case the Lebesgue 
integral is said to be infinite. If the sum is finite for n = 1, however, the figure shows that the 
change in going from the approximation of order n to n+1 is nonnegative and upper bounded 
by T2^{−n−1}. Thus it is clear that the sequence of approximations has a finite limit which is 
defined¹⁹ to be the Lebesgue integral of u(t). In summary, the Lebesgue integral of an arbitrary 
measurable nonnegative function {u(t) : [−T/2, T/2] → R} is finite if any approximation is 
finite and is then given by 

$$\int u(t)\,dt = \lim_{n\to\infty} \sum_{m=0}^{\infty} m 2^{-n} \mu_{m,n}, \quad \text{where } \mu_{m,n} = \mu\bigl(t : m2^{-n} \le u(t) < (m+1)2^{-n}\bigr). \qquad (4.14)$$
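
The monotone behavior of the nth order approximations can be observed numerically. The sketch below assumes the specific function u(t) = t² on [−1, 1] (so T = 2), chosen only because the measure of {t : u(t) ≥ a} is then 2 − 2√a in closed form; the function names are hypothetical.

```python
import math

T = 2.0   # u(t) = t**2 on [-1, 1]

def measure_at_least(a):
    """Measure of {t in [-1, 1] : t*t >= a} for this particular u(t)."""
    if a <= 0.0:
        return T
    if a >= 1.0:
        return 0.0   # at a = 1 only the two end points remain
    return T - 2.0 * math.sqrt(a)

def approximation(n):
    """n-th order approximation: sum over m of m * 2**-n * mu_{m,n}."""
    delta = 2.0 ** -n
    total = 0.0
    for m in range(2 ** n + 1):      # u(t) never exceeds 1
        mu_mn = measure_at_least(m * delta) - measure_at_least((m + 1) * delta)
        total += m * delta * mu_mn
    return total

previous = None
for n in range(1, 11):
    a_n = approximation(n)
    step = None if previous is None else a_n - previous
    print(n, a_n, step)
    previous = a_n
```

The printed values increase toward 2/3, the integral of t² over [−1, 1], and each increase from order n to n+1 stays between 0 and T·2^(−n−1).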

Example 4.3.3. Consider a function that has the value 1 for each rational number in 
[−T/2, T/2] and 0 for all irrational numbers. The set of rationals has zero measure, as shown 
in Appendix 4A.1, so that each of the above approximations to the Lebesgue integral is 0 and 
thus the limit is zero. This is a simple example of a function that has a Lebesgue integral but 
no Riemann integral. 


Next consider two non-negative measurable functions u(t) and v(t) on [−T/2, T/2] and assume 
u(t) = v(t) except on a set of zero measure. Then each of the approximations in (4.14) is 
identical for u(t) and v(t), and thus the two integrals are identical (either both infinite or both 
the same number). This same property will be seen to carry over for functions that also take on 
negative values and for complex valued functions. This property says that sets of zero measure 
can be ignored in integration. This is one of the major simplifications afforded by Lebesgue 
integration. Two functions that are the same except on a set of zero measure are said to be 
equal almost everywhere, abbreviated a.e. For example, the rectangular pulse and its Fourier 
series representation illustrated in Figure 4.2 are equal a.e. 

¹⁸For example, this sum is infinite if u(t) = 1/|t| for −T/2 ≤ t ≤ T/2. The situation here is essentially the 
same for Riemann and Lebesgue integration. 

¹⁹This limiting operation can be shown to be independent of how the quantization intervals approach 0. 


For functions taking on both positive and negative values, the function u(t) can be separated 
into a positive part u⁺(t) and a negative part u⁻(t). These are defined by 

$$u^+(t) = \begin{cases} u(t) & \text{for } t : u(t) \ge 0 \\ 0 & \text{for } t : u(t) < 0 \end{cases}, \qquad u^-(t) = \begin{cases} 0 & \text{for } t : u(t) \ge 0 \\ -u(t) & \text{for } t : u(t) < 0. \end{cases}$$

For all t ∈ [−T/2, T/2] then, 

$$u(t) = u^+(t) - u^-(t). \qquad (4.15)$$

If u(t) is measurable, then u⁺(t) and u⁻(t) are also.²⁰ Since these are nonnegative, they can be 
integrated as before, and each integral exists with either a finite or infinite value. If at most one 
of these integrals is infinite, the Lebesgue integral of u(t) is defined as 

$$\int u(t)\,dt = \int u^+(t)\,dt - \int u^-(t)\,dt. \qquad (4.16)$$

If both ∫ u⁺(t) dt and ∫ u⁻(t) dt are infinite, then the integral is undefined. 
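
The decomposition into positive and negative parts is easy to try on a concrete case. The sketch below assumes the test waveform u(t) = t − 0.1 on [−1/2, 1/2] (an arbitrary choice, not from the text) and uses a midpoint-rule integral as a stand-in for the Lebesgue integral, which is legitimate here because the function is piecewise linear and the two integrals agree.

```python
def pos(x):
    """Positive part: x where x >= 0, else 0."""
    return x if x >= 0 else 0.0

def neg(x):
    """Negative part: -x where x < 0, else 0 (so the result is nonnegative)."""
    return -x if x < 0 else 0.0

def integrate(f, T=1.0, n=1000):
    """Midpoint-rule integral of f over [-T/2, T/2]."""
    dt = T / n
    return sum(f(-T / 2 + (i + 0.5) * dt) for i in range(n)) * dt

def u(t):
    return t - 0.1   # hypothetical waveform taking both signs

int_pos = integrate(lambda t: pos(u(t)))   # integral of the positive part
int_neg = integrate(lambda t: neg(u(t)))   # integral of the negative part
print(int_pos, int_neg, int_pos - int_neg, integrate(u))
```

Both parts have finite integrals (0.08 and 0.18 here), so the integral of u(t) is defined and equals their difference, −0.1.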


Finally, a complex function {u(t) : [−T/2, T/2] → C} is defined to be measurable if the real and 
imaginary parts of u(t) are measurable. If the integrals of ℜ(u(t)) and ℑ(u(t)) are defined, then 
the Lebesgue integral ∫ u(t) dt is defined by 

$$\int u(t)\,dt = \int \Re(u(t))\,dt + i \int \Im(u(t))\,dt. \qquad (4.17)$$

The integral is undefined otherwise. Note that this implies that any integration property of 
complex valued functions {u(t) : [−T/2, T/2] → C} is also shared by real valued functions 
{u(t) : [−T/2, T/2] → R}. 


4.3.4 Measurability of functions defined by other functions 


The definitions of measurable functions and Lebesgue integration in the last subsection were 
quite simple given the concept of measure. However, functions are often defined in terms of other 
more elementary functions, so the question arises whether measurability of those elementary 
functions implies that of the defined function. The bottom-line answer is almost invariably yes. 
For this reason it is often assumed in the following sections that all functions of interest are 
measurable. Several results are now given fortifying this bottom-line view. 

First, if {u(t) : [−T/2, T/2] → R} is measurable, then −u(t), |u(t)|, u²(t), e^{u(t)}, and ln |u(t)| are 
also measurable. These and similar results follow immediately from the definition of measurable 
functions and are derived in Exercise 4.12. 


Next, if u(t) and v(t) are measurable, then u(t) + v(t) and u(t)v(t) are measurable (see Exercise 
4.13). 


²⁰To see this, note that for β > 0, {t : u⁺(t) < β} = {t : u(t) < β}. For β ≤ 0, {t : u⁺(t) < β} is the empty 
set. A similar argument works for u⁻(t). 

Finally, if {u_k(t) : [−T/2, T/2] → R} is a measurable function for each integer k ≥ 1, then 
inf_k u_k(t) is measurable. This can be seen by noting that {t : inf_k[u_k(t)] < α} = ∪_k {t : u_k(t) < 
α}, which is measurable for each α. Using this result, Exercise 4.15 shows that lim_k u_k(t) is 
measurable if the limit exists for all t ∈ [−T/2, T/2]. 


4.3.5 ℒ₁ and ℒ₂ functions over [−T/2, T/2] 


A function {u(t) : [−T/2, T/2] → C} is said to be ℒ₁, or in the class ℒ₁, if u(t) is measurable 
and the Lebesgue integral of |u(t)| is finite.²¹ 


For the special case of a real function, {u(t) : [−T/2, T/2] → R}, the magnitude |u(t)| can be 
expressed in terms of the positive and negative parts of u(t) as |u(t)| = u⁺(t) + u⁻(t). Thus 
u(t) is ℒ₁ if and only if both u⁺(t) and u⁻(t) have finite integrals. In other words, u(t) is ℒ₁ if 
and only if the Lebesgue integral of u(t) is defined and finite. 

For a complex function {u(t) : [−T/2, T/2] → C}, it can be seen that u(t) is ℒ₁ if and only if 
both ℜ[u(t)] and ℑ[u(t)] are ℒ₁. Thus u(t) is ℒ₁ if and only if ∫ u(t) dt is defined and finite. 

A function {u(t) : [−T/2, T/2] → R} or {u(t) : [−T/2, T/2] → C} is said to be an ℒ₂ function, 
or a finite-energy function, if u(t) is measurable and the Lebesgue integral of |u(t)|² is finite. 
All source and channel waveforms discussed in this text will be assumed to be ℒ₂. Although ℒ₂ 
functions are of primary interest here, the class of ℒ₁ functions is of almost equal importance 
in understanding Fourier series and Fourier transforms. An important relation between ℒ₁ and 
ℒ₂ is given in the following simple theorem, illustrated in Figure 4.7. 


Theorem 4.3.2. If {u(t) : [−T/2, T/2] → C} is ℒ₂, then it is also ℒ₁. 
Proof: Note that |u(t)| ≤ |u(t)|² for all t such that |u(t)| ≥ 1. Thus |u(t)| ≤ |u(t)|² + 1 for 
all t, so that ∫ |u(t)| dt ≤ ∫ |u(t)|² dt + T. If the function u(t) is ℒ₂, then the right side of this 
equation is finite, so the function is also ℒ₁. 


[Figure 4.7: nested regions, innermost to outermost: ℒ₂ functions [−T/2, T/2] → C; ℒ₁ functions [−T/2, T/2] → C; measurable functions [−T/2, T/2] → C.]

Figure 4.7: Illustration showing that for functions from [−T/2, T/2] to C, the class 
of ℒ₂ functions is contained in the class of ℒ₁ functions, which in turn is contained 
in the class of measurable functions. The restriction here to a finite domain such as 
[−T/2, T/2] is necessary, as seen later. 


This completes our basic introduction to measure and Lebesgue integration over the finite 
interval [−T/2, T/2]. The fact that the class of measurable sets is closed under complementation, 
countable unions, and countable intersections underlies the results about the measurability of 
functions being preserved over countable limits and sums. These in turn underlie the basic 
results about Fourier series, Fourier integrals, and orthogonal expansions. Some of those 
results will be stated without proof, but an understanding of measurability will let us understand 
what those results mean. Finally, ignoring sets of zero measure will simplify almost everything 
involving integration. 

²¹ℒ₁ functions are sometimes called integrable functions. 

Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare 
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 


4.4 The Fourier series for $\mathcal{L}_2$ waveforms


The most important results about Fourier series for $\mathcal{L}_2$ functions are as follows:


Theorem 4.4.1 (Fourier series). Let $\{u(t) : [-T/2, T/2] \to \mathbb{C}\}$ be an $\mathcal{L}_2$ function. Then for
each $k \in \mathbb{Z}$, the Lebesgue integral

$$\hat{u}_k = \frac{1}{T}\int_{-T/2}^{T/2} u(t)\, e^{-2\pi i k t/T}\, dt \tag{4.18}$$

exists and satisfies $|\hat{u}_k| \le \frac{1}{T}\int |u(t)|\, dt < \infty$. Furthermore,

$$\lim_{\ell\to\infty} \int_{-T/2}^{T/2} \Bigl| u(t) - \sum_{k=-\ell}^{\ell} \hat{u}_k\, e^{2\pi i k t/T} \Bigr|^2 dt = 0, \tag{4.19}$$

where the limit is monotonic in $\ell$. Also, the energy equation (4.6) is satisfied.

Conversely, if $\{\hat{u}_k;\ k \in \mathbb{Z}\}$ is a two-sided sequence of complex numbers satisfying
$\sum_{k=-\infty}^{\infty} |\hat{u}_k|^2 < \infty$, then an $\mathcal{L}_2$ function $\{u(t) : [-T/2, T/2] \to \mathbb{C}\}$
exists such that (4.6) and (4.19) are satisfied.

The first part of the theorem is simple. Since $u(t)$ is measurable and $e^{-2\pi i k t/T}$ is measurable
for each $k$, the product $u(t)e^{-2\pi i k t/T}$ is measurable. Also $|u(t)e^{-2\pi i k t/T}| = |u(t)|$, so that
$u(t)e^{-2\pi i k t/T}$ is $\mathcal{L}_1$ and the integral exists with the given upper bound (see Exercise 4.17). The
rest of the proof is in the next chapter, Section 5.3.4.


The integral in (4.19) is the energy in the difference between $u(t)$ and the partial Fourier series
using only the terms $-\ell \le k \le \ell$. Thus (4.19) asserts that $u(t)$ can be approximated arbitrarily
closely (in terms of difference energy) by finitely many terms in its Fourier series.
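The behaviour asserted by (4.19) can be observed numerically. The sketch below assumes NumPy and a hypothetical example waveform (a pulse of width $T/2$, not taken from the text); the integrals in (4.18) and (4.19) are replaced by midpoint sums, so the numbers are approximations rather than exact values.

```python
import numpy as np

T = 1.0
N = 4096                                   # midpoint samples on [-T/2, T/2]
t = -T / 2 + (np.arange(N) + 0.5) * T / N

# Hypothetical example waveform: a pulse of width T/2 centered at 0.
u = (np.abs(t) < T / 4).astype(float)

def coeff(k):
    """Numerical version of (4.18): (1/T) * integral of u(t) e^{-2 pi i k t / T}."""
    return np.mean(u * np.exp(-2j * np.pi * k * t / T))

def difference_energy(ell):
    """Numerical version of the integral in (4.19) for a given ell."""
    partial = sum(coeff(k) * np.exp(2j * np.pi * k * t / T)
                  for k in range(-ell, ell + 1))
    return np.mean(np.abs(u - partial) ** 2) * T

ells = [1, 2, 4, 8, 16, 32, 64]
energies = [difference_energy(ell) for ell in ells]
for ell, e in zip(ells, energies):
    print(f"ell = {ell:3d}   difference energy = {e:.5f}")
```

The printed difference energies never increase with $\ell$, which is the monotonicity stated in the theorem. For a waveform with jumps such as this one, the decay is slow.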


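A series is defined to converge in $\mathcal{L}_2$ if (4.19) holds. The notation l.i.m. (limit in mean-square)
is used to denote $\mathcal{L}_2$ convergence, so (4.19) is often abbreviated by

$$u(t) = \operatorname*{l.i.m.} \sum_k \hat{u}_k\, e^{2\pi i k t/T}\, \operatorname{rect}\Bigl(\frac{t}{T}\Bigr). \tag{4.20}$$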


The notation does not indicate that the sum in (4.20) converges pointwise to $u(t)$ at each $t$; for
example, the Fourier series in Figure 4.2 converges to 1/2 rather than 1 at the values $t = \pm 1/4$.
In fact, any two $\mathcal{L}_2$ functions that are equal a.e. have the same Fourier series coefficients.
Thus the best to be hoped for is that $\sum_k \hat{u}_k\, e^{2\pi i k t/T}\operatorname{rect}(\frac{t}{T})$ converges pointwise and yields a
'canonical representative' for all the $\mathcal{L}_2$ functions that have the given set of Fourier coefficients,
$\{\hat{u}_k;\ k \in \mathbb{Z}\}$.

Unfortunately, there are some rather bizarre $\mathcal{L}_2$ functions (see the everywhere discontinuous
example in Section 5A.1) for which $\sum_k \hat{u}_k\, e^{2\pi i k t/T}\operatorname{rect}(\frac{t}{T})$ diverges for some values of $t$.
There is an important theorem due to Carleson [3], however, stating that if $u(t)$ is $\mathcal{L}_2$, then
$\sum_k \hat{u}_k\, e^{2\pi i k t/T}\operatorname{rect}(\frac{t}{T})$ converges almost everywhere on $[-T/2, T/2]$. Thus for any $\mathcal{L}_2$ function
$u(t)$, with Fourier coefficients $\{\hat{u}_k : k \in \mathbb{Z}\}$, there is a well-defined function,

$$\tilde{u}(t) = \begin{cases} \sum_k \hat{u}_k\, e^{2\pi i k t/T}\operatorname{rect}\bigl(\frac{t}{T}\bigr) & \text{if the sum converges} \\ 0 & \text{otherwise.} \end{cases} \tag{4.21}$$

Since the sum above converges a.e., the Fourier coefficients of $\tilde{u}(t)$ given by (4.18) agree with
those in (4.21). Thus $\tilde{u}(t)$ can serve as a canonical representative for all the $\mathcal{L}_2$ functions with
the same Fourier coefficients $\{\hat{u}_k;\ k \in \mathbb{Z}\}$. From the difference-energy equation (4.7), it follows
that the difference between any two $\mathcal{L}_2$ functions with the same Fourier coefficients has zero
energy. Two $\mathcal{L}_2$ functions whose difference has zero energy are said to be $\mathcal{L}_2$ equivalent; thus all
$\mathcal{L}_2$ functions with the same Fourier coefficients are $\mathcal{L}_2$ equivalent. Exercise 4.18 shows that two
$\mathcal{L}_2$ functions are $\mathcal{L}_2$ equivalent if and only if they are equal almost everywhere.


In summary, each $\mathcal{L}_2$ function $\{u(t) : [-T/2, T/2] \to \mathbb{C}\}$ belongs to an equivalence class
consisting of all $\mathcal{L}_2$ functions with the same set of Fourier coefficients. Any two functions in
this equivalence class are $\mathcal{L}_2$ equivalent and equal a.e. The canonical representative in (4.21) is
determined solely by the Fourier coefficients and is uniquely defined for any given set of Fourier
coefficients satisfying $\sum_k |\hat{u}_k|^2 < \infty$; the corresponding equivalence class consists of the $\mathcal{L}_2$
functions that are equal to $\tilde{u}(t)$ a.e.


From an engineering standpoint, the sequence of ever closer approximations in (4.19) is usually
more relevant than the notion of an equivalence class of functions with the same Fourier
coefficients. In fact, for physical waveforms, there is no physical test that can distinguish
waveforms that are $\mathcal{L}_2$ equivalent, since any such physical test requires an energy difference. At the
same time, if functions $\{u(t) : [-T/2, T/2] \to \mathbb{C}\}$ are consistently represented by their Fourier
coefficients, then equivalence classes can usually be ignored.


For all but the most bizarre $\mathcal{L}_2$ functions, the Fourier series converges everywhere to some
function that is $\mathcal{L}_2$ equivalent to the original function, and thus, as with the points $t = \pm 1/4$
in the example of Figure 4.2, it is usually unimportant how one views the function at those
isolated points. Occasionally, however, particularly when discussing sampling and vector spaces,
the concept of equivalence classes becomes relevant.


4.4.1 The T-spaced truncated sinusoid expansion 


There is nothing special about the choice of 0 as the center point of a time-limited function. For
a function $\{v(t) : [\Delta - T/2, \Delta + T/2] \to \mathbb{C}\}$ centered around some arbitrary time $\Delta$, the shifted
Fourier series over that interval is²²

$$v(t) = \operatorname*{l.i.m.} \sum_k \hat{v}_k\, e^{2\pi i k t/T}\, \operatorname{rect}\Bigl(\frac{t-\Delta}{T}\Bigr), \quad\text{where} \tag{4.22}$$

$$\hat{v}_k = \frac{1}{T}\int_{\Delta - T/2}^{\Delta + T/2} v(t)\, e^{-2\pi i k t/T}\, dt, \qquad -\infty < k < \infty. \tag{4.23}$$

To see this, let $u(t) = v(t + \Delta)$. Then $u(0) = v(\Delta)$ and $u(t)$ is centered around 0 and has a
Fourier series given by (4.20) and (4.18). Letting $\hat{v}_k = \hat{u}_k\, e^{-2\pi i k \Delta/T}$ yields (4.22) and (4.23).


²²Note that the Fourier relationship between the function $v(t)$ and the sequence $\{\hat{v}_k\}$ depends implicitly on the
interval $T$ and the shift $\Delta$.

The results about measure and integration are not changed by this shift in the time axis.
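The substitution $u(t) = v(t+\Delta)$ used to obtain the shifted Fourier series can be written out explicitly. The following is a sketch of that change of variables; the integration variable $s = t + \Delta$ is introduced here only for the derivation.

```latex
\begin{align*}
\hat{u}_k &= \frac{1}{T}\int_{-T/2}^{T/2} v(t+\Delta)\, e^{-2\pi i k t/T}\, dt
           = \frac{1}{T}\int_{\Delta-T/2}^{\Delta+T/2} v(s)\, e^{-2\pi i k (s-\Delta)/T}\, ds
           = e^{2\pi i k \Delta/T}\, \hat{v}_k , \\
v(t) &= u(t-\Delta)
      = \operatorname*{l.i.m.} \sum_k \hat{u}_k\, e^{2\pi i k (t-\Delta)/T}\,
        \operatorname{rect}\Bigl(\frac{t-\Delta}{T}\Bigr)
      = \operatorname*{l.i.m.} \sum_k \hat{v}_k\, e^{2\pi i k t/T}\,
        \operatorname{rect}\Bigl(\frac{t-\Delta}{T}\Bigr).
\end{align*}
```

The first line gives $\hat{v}_k = \hat{u}_k e^{-2\pi i k\Delta/T}$, and the second line is the shifted series with those coefficients.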


Next, suppose that some given function $u(t)$ is either not time-limited or limited to some very
large interval. An important method for source coding is first to break such a function into
segments, say of duration $T$, and then to encode each segment²³ separately. A segment can be
encoded by expanding it in a Fourier series and then encoding the Fourier series coefficients.


Most voice compression algorithms use such an approach, usually breaking the voice waveform 
into 20 msec segments. Voice compression algorithms typically use the detailed structure of 
voice rather than simply encoding the Fourier series coefficients, but the frequency structure of 
voice is certainly important in this process. Thus understanding the Fourier series approach is 
a good first step in understanding voice compression. 


The implementation of voice compression (as well as most signal processing techniques) usually 
starts with sampling at a much higher rate than the segment duration above. This sampling is 
followed by high-rate quantization of the samples, which are then processed digitally. Concep- 
tually, however, it is preferable to work directly with the waveform and with expansions such 
as the Fourier series. The analog parts of the resulting algorithms can then be implemented by 
the standard techniques of high-rate sampling and digital signal processing. 


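Suppose that an $\mathcal{L}_2$ waveform $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ is segmented into segments $u_m(t)$ of duration $T$.
Expressing $u(t)$ as the sum of these segments,²⁴

$$u(t) = \operatorname*{l.i.m.} \sum_m u_m(t), \quad\text{where}\quad u_m(t) = u(t)\, \operatorname{rect}\Bigl(\frac{t}{T} - m\Bigr). \tag{4.24}$$

Expanding each segment $u_m(t)$ by the shifted Fourier series of (4.22) and (4.23):

$$u_m(t) = \operatorname*{l.i.m.} \sum_k \hat{u}_{k,m}\, e^{2\pi i k t/T}\, \operatorname{rect}\Bigl(\frac{t}{T} - m\Bigr), \quad\text{where} \tag{4.25}$$

$$\hat{u}_{k,m} = \frac{1}{T}\int_{mT - T/2}^{mT + T/2} u_m(t)\, e^{-2\pi i k t/T}\, dt
 = \frac{1}{T}\int_{-\infty}^{\infty} u(t)\, e^{-2\pi i k t/T}\, \operatorname{rect}\Bigl(\frac{t}{T} - m\Bigr)\, dt. \tag{4.26}$$

Combining (4.24) and (4.25),

$$u(t) = \operatorname*{l.i.m.} \sum_m \sum_k \hat{u}_{k,m}\, e^{2\pi i k t/T}\, \operatorname{rect}\Bigl(\frac{t}{T} - m\Bigr).$$

This expands $u(t)$ as a weighted sum²⁵ of doubly indexed functions

$$u(t) = \operatorname*{l.i.m.} \sum_m \sum_k \hat{u}_{k,m}\, \theta_{k,m}(t), \quad\text{where}\quad \theta_{k,m}(t) = e^{2\pi i k t/T}\, \operatorname{rect}\Bigl(\frac{t}{T} - m\Bigr). \tag{4.27}$$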


²³Any engineer, experienced or not, when asked to analyze a segment of a waveform, will automatically shift
the time axis so that 0 is either the beginning or the center of the waveform. The added complication here simply
arises from looking at multiple segments together so as to represent the entire waveform.

²⁴This sum double-counts the points at the ends of the segments, but this makes no difference in terms of $\mathcal{L}_2$
convergence. Exercise 4.22 treats the convergence in (4.24) and (4.28) more carefully.

²⁵Exercise 4.21 shows why (4.27) (and similar later expressions) are independent of the order of the limits.
The functions $\theta_{k,m}(t)$ are orthogonal, since, for $m \ne m'$, the functions $\theta_{k,m}(t)$ and $\theta_{k',m'}(t)$ do
not overlap, and, for $m = m'$ and $k \ne k'$, $\theta_{k,m}(t)$ and $\theta_{k',m}(t)$ are orthogonal as before. These
functions, $\{\theta_{k,m}(t);\ k, m \in \mathbb{Z}\}$, are called the T-spaced truncated sinusoids and the expansion in
(4.27) is called the T-spaced truncated sinusoid expansion.


The coefficients $\hat{u}_{k,m}$ are indexed by $k, m \in \mathbb{Z}$ and thus form a countable set.²⁶ This permits the
conversion of an arbitrary $\mathcal{L}_2$ waveform into a countably infinite sequence of complex numbers,
in the sense that the numbers can be found from the waveform, and the waveform can be
reconstructed from the sequence, at least up to $\mathcal{L}_2$ equivalence.


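The l.i.m. notation in (4.27) denotes $\mathcal{L}_2$ convergence; i.e.,

$$\lim_{n,\ell\to\infty} \int_{-\infty}^{\infty} \Bigl| u(t) - \sum_{m=-n}^{n} \sum_{k=-\ell}^{\ell} \hat{u}_{k,m}\, \theta_{k,m}(t) \Bigr|^2 dt = 0. \tag{4.28}$$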


This shows that any given u(t) can be approximated arbitrarily closely by a finite set of co- 
efficients. In particular, each segment can be approximated by a finite set of coefficients, and 
a finite set of segments approximates the entire waveform (although the required number of 
segments and coefficients per segment clearly depend on the particular waveform). 
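The finite approximation in (4.28) can be tried out directly. The following is a minimal sketch, assuming NumPy, a hypothetical Gaussian test waveform, and midpoint sums in place of the integrals; the values of `N`, `L`, and `n` are illustrative choices, not values from the text.

```python
import numpy as np

T = 1.0          # segment duration
N = 256          # midpoint samples per segment
L = 32           # keep frequencies -L..L in each segment
n = 4            # keep segments -n..n

def u(t):
    """Hypothetical example waveform (not from the text): a Gaussian pulse."""
    return np.exp(-t ** 2)

ks = np.arange(-L, L + 1)
total_energy = 0.0     # energy of u over the retained segments
kept_energy = 0.0      # T * sum of |u_hat_{k,m}|^2 over the retained coefficients
error_energy = 0.0     # energy of u minus the finite double sum in (4.28)
for m in range(-n, n + 1):
    # Midpoint samples of segment m, i.e. of [mT - T/2, mT + T/2].
    t = m * T - T / 2 + (np.arange(N) + 0.5) * T / N
    seg = u(t)
    basis = np.exp(2j * np.pi * np.outer(ks, t) / T)   # theta_{k,m}(t) on the segment
    coeffs = basis.conj() @ seg / N                    # numerical version of (4.26)
    approx = coeffs @ basis                            # finite sum over k for this m
    total_energy += np.mean(np.abs(seg) ** 2) * T
    kept_energy += T * np.sum(np.abs(coeffs) ** 2)
    error_energy += np.mean(np.abs(seg - approx) ** 2) * T

print(f"relative error energy = {error_energy / total_energy:.2e}")
```

Because the truncated sinusoids are orthogonal, the error energy equals the total energy minus the energy carried by the retained coefficients, which the sketch reproduces up to rounding.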


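For data compression, a waveform $u(t)$ represented by the coefficients $\{\hat{u}_{k,m};\ k, m \in \mathbb{Z}\}$ can
be compressed by quantizing each $\hat{u}_{k,m}$ into a representative $\hat{v}_{k,m}$. The energy equation (4.6)
and the difference-energy equation (4.7) generalize easily to the T-spaced truncated sinusoid
expansion as

$$\int_{-\infty}^{\infty} |u(t)|^2\, dt = T \sum_{m=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} |\hat{u}_{k,m}|^2, \tag{4.29}$$

$$\int_{-\infty}^{\infty} |u(t) - v(t)|^2\, dt = T \sum_{k=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} |\hat{u}_{k,m} - \hat{v}_{k,m}|^2. \tag{4.30}$$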

As in Section 4.2.1, a finite set of coefficients should be chosen for compression and the remaining 
coefficients should be set to 0. The problem of compression (given this expansion) is then to 
decide how many coefficients to compress, and how many bits to use for each selected coefficient. 
This of course requires a probabilistic model for the coefficients; this issue is discussed later. 
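The compression step described above (quantize a selected set of coefficients, set the rest to 0) can be sketched as follows. This is an illustration under stated assumptions, not a method from the text: it uses NumPy, a hypothetical test waveform, a hypothetical uniform quantizer with step `step`, and a sampled grid on which each segment is represented exactly by `N` discrete frequencies, so that the difference-energy relation (4.30) holds on the grid up to rounding.

```python
import numpy as np

T = 1.0
N = 64                      # midpoint samples per segment
L = 8                       # quantize coefficients with |k| <= L, zero the rest
step = 0.01                 # hypothetical uniform quantizer step size
segments = range(-3, 4)

def u(t):
    """Hypothetical example waveform (not from the text)."""
    return np.exp(-t ** 2) * np.cos(3 * t)

ks = np.arange(-N // 2, N // 2)     # all N discrete frequencies on the grid
lhs = 0.0                            # integral of |u(t) - v(t)|^2, as in (4.30)
rhs = 0.0                            # T * sum of |u_hat - v_hat|^2, as in (4.30)
for m in segments:
    t = m * T - T / 2 + (np.arange(N) + 0.5) * T / N
    basis = np.exp(2j * np.pi * np.outer(ks, t) / T)
    u_hat = basis.conj() @ u(t) / N                    # numerical version of (4.26)
    # Uniform scalar quantization of the real and imaginary parts.
    v_hat = step * (np.round(u_hat.real / step) + 1j * np.round(u_hat.imag / step))
    v_hat[np.abs(ks) > L] = 0.0                        # unselected coefficients
    v = v_hat @ basis                                  # reconstructed segment
    lhs += np.mean(np.abs(u(t) - v) ** 2) * T
    rhs += T * np.sum(np.abs(u_hat - v_hat) ** 2)

print(f"difference energy = {lhs:.3e}, coefficient-domain value = {rhs:.3e}")
```

The total distortion splits into a part from the discarded coefficients and a part from quantizing the retained ones, which is why the choice of how many coefficients to keep and how finely to quantize them can be made in the coefficient domain.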


There is a practical problem with the use of T-spaced truncated sinusoids as an expansion to be 
used in data compression. The boundaries of the segments usually act like step discontinuities (as 
in Figure 4.3) and this leads to slow convergence over the Fourier coefficients for each segment. 
These discontinuities could be removed prior to taking a Fourier series, but the current objective 
is simply to illustrate one general approach for converting arbitrary $\mathcal{L}_2$ waveforms to sequences
of numbers. Before considering other expansions, it is important to look at Fourier transforms. 


4.5 Fourier transforms and $\mathcal{L}_2$ waveforms


The T-spaced truncated sinusoid expansion corresponds closely to our physical notion of
frequency. For example, musical notes correspond to particular frequencies (and their harmonics),
but these notes persist for finite durations and then change to notes at other frequencies.
However, the parameter $T$ in the T-spaced expansion is arbitrary, and quantizing frequencies in
increments of $1/T$ is awkward.

²⁶Example 4A.2 in Section 4A.1 explains why the doubly indexed set above is countable.


The Fourier transform avoids the need for segmentation into T-spaced intervals, but also removes
the capability of looking at frequencies that change in time. It maps a function of time,
$\{u(t) : \mathbb{R} \to \mathbb{C}\}$, into a function of frequency,²⁷ $\{\hat{u}(f) : \mathbb{R} \to \mathbb{C}\}$. The inverse Fourier transform maps
$\hat{u}(f)$ back into $u(t)$, essentially making $\hat{u}(f)$ an alternative representation of $u(t)$.


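The Fourier transform and its inverse are defined by

$$\hat{u}(f) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt, \tag{4.31}$$

$$u(t) = \int_{-\infty}^{\infty} \hat{u}(f)\, e^{2\pi i f t}\, df. \tag{4.32}$$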


The time units are seconds and the frequency units Hertz (Hz), i.e., cycles per second.


For now we take the conventional engineering viewpoint that any respectable function $u(t)$ has
a Fourier transform $\hat{u}(f)$ given by (4.31), and that $u(t)$ can be retrieved from $\hat{u}(f)$ by (4.32).
This will shortly be done more carefully for $\mathcal{L}_2$ waveforms.

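The following table reviews a few standard Fourier transform relations. In the table, $u(t)$ and
$\hat{u}(f)$ denote a Fourier transform pair, written $u(t) \leftrightarrow \hat{u}(f)$, and similarly $v(t) \leftrightarrow \hat{v}(f)$.

$$
\begin{array}{rcll}
a\,u(t) + b\,v(t) & \leftrightarrow & a\,\hat{u}(f) + b\,\hat{v}(f) & \text{linearity} \quad (4.33) \\
u^*(-t) & \leftrightarrow & \hat{u}^*(f) & \text{conjugation} \quad (4.34) \\
\hat{u}(t) & \leftrightarrow & u(-f) & \text{time/frequency duality} \quad (4.35) \\
u(t-\tau) & \leftrightarrow & e^{-2\pi i f \tau}\,\hat{u}(f) & \text{time shift} \quad (4.36) \\
u(t)\, e^{2\pi i f_0 t} & \leftrightarrow & \hat{u}(f - f_0) & \text{frequency shift} \quad (4.37) \\
u(t/T) & \leftrightarrow & T\,\hat{u}(fT) & \text{scaling (for } T > 0) \quad (4.38) \\
du(t)/dt & \leftrightarrow & 2\pi i f\,\hat{u}(f) & \text{differentiation} \quad (4.39) \\
\int_{-\infty}^{\infty} u(\tau)\, v(t-\tau)\, d\tau & \leftrightarrow & \hat{u}(f)\,\hat{v}(f) & \text{convolution} \quad (4.40) \\
\int_{-\infty}^{\infty} u(\tau)\, v^*(\tau - t)\, d\tau & \leftrightarrow & \hat{u}(f)\,\hat{v}^*(f) & \text{correlation} \quad (4.41)
\end{array}
$$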


These relations will be used extensively in what follows. Time-frequency duality is particularly 
important, since it permits the translation of results about Fourier transforms to inverse Fourier 
transforms and vice versa. 


Exercise 4.23 reviews the convolution relation (4.40). Equation (4.41) results from conjugating
$\hat{v}(f)$ in (4.40).
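The step from the convolution relation to the correlation relation can be spelled out as follows. This is a sketch using only the conjugation and convolution relations from the table above; the auxiliary function $w(t)$ is introduced here for the derivation.

```latex
\begin{align*}
w(t) &= v^*(-t) \;\leftrightarrow\; \hat{w}(f) = \hat{v}^*(f)
      && \text{by conjugation (4.34)}, \\
\int_{-\infty}^{\infty} u(\tau)\, w(t-\tau)\, d\tau
     &= \int_{-\infty}^{\infty} u(\tau)\, v^*(\tau - t)\, d\tau
      \;\leftrightarrow\; \hat{u}(f)\,\hat{w}(f) = \hat{u}(f)\,\hat{v}^*(f)
      && \text{by convolution (4.40)}.
\end{align*}
```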


Two useful special cases of any Fourier transform pair are:

$$u(0) = \int_{-\infty}^{\infty} \hat{u}(f)\, df; \tag{4.42}$$

$$\hat{u}(0) = \int_{-\infty}^{\infty} u(t)\, dt. \tag{4.43}$$


²⁷The notation $\hat{u}(f)$, rather than the more usual $U(f)$, is used here since capitalization is used to distinguish random
variables from sample values. Later, $\{U(t) : \mathbb{R} \to \mathbb{C}\}$ will be used to denote a random process, where, for each $t$,
$U(t)$ is a random variable.
These are useful in checking multiplicative constants. Also Parseval's theorem results from
applying (4.42) to (4.41):

$$\int_{-\infty}^{\infty} u(t)\, v^*(t)\, dt = \int_{-\infty}^{\infty} \hat{u}(f)\, \hat{v}^*(f)\, df. \tag{4.44}$$

As a corollary, replacing $v(t)$ by $u(t)$ in (4.44) results in the energy equation for Fourier
transforms, namely

$$\int_{-\infty}^{\infty} |u(t)|^2\, dt = \int_{-\infty}^{\infty} |\hat{u}(f)|^2\, df. \tag{4.45}$$
The magnitude squared of the frequency function, $|\hat{u}(f)|^2$, is called the spectral density of $u(t)$.
It is the energy per unit frequency (for positive and negative frequencies) in the waveform.
The energy equation then says that energy can be calculated by integrating over either time or
frequency.


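As another corollary of (4.44), note that if $u(t)$ and $v(t)$ are orthogonal, then $\hat{u}(f)$ and $\hat{v}(f)$ are
orthogonal; i.e.,

$$\int_{-\infty}^{\infty} u(t)\, v^*(t)\, dt = 0 \quad\text{if and only if}\quad \int_{-\infty}^{\infty} \hat{u}(f)\, \hat{v}^*(f)\, df = 0. \tag{4.46}$$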
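The following table gives a short set of useful and familiar transform pairs:

$$
\begin{array}{rcll}
\operatorname{sinc}(t) = \dfrac{\sin(\pi t)}{\pi t} & \leftrightarrow & \operatorname{rect}(f) = \begin{cases} 1 & \text{for } |f| \le 1/2 \\ 0 & \text{for } |f| > 1/2 \end{cases} & (4.47) \\[2ex]
e^{-\pi t^2} & \leftrightarrow & e^{-\pi f^2} & (4.48) \\[1ex]
e^{-at},\ t \ge 0 & \leftrightarrow & \dfrac{1}{a + 2\pi i f} \quad \text{for } a > 0 & (4.49) \\[2ex]
e^{-a|t|} & \leftrightarrow & \dfrac{2a}{a^2 + (2\pi f)^2} \quad \text{for } a > 0 & (4.50)
\end{array}
$$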


The above table, in conjunction with the relations above, yields a large set of transform pairs. 
Much more extensive tables are widely available. 
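As a check on multiplicative constants, the energy equation (4.45) and the special case (4.43) can be verified by hand for the one-sided exponential pair, $u(t) = e^{-at}$ for $t \ge 0$ (and 0 for $t < 0$) with $\hat{u}(f) = 1/(a + 2\pi i f)$. The following worked calculation is a sketch using only elementary integrals.

```latex
\begin{align*}
\int_{-\infty}^{\infty} |u(t)|^2\, dt &= \int_0^{\infty} e^{-2at}\, dt = \frac{1}{2a}, \\
\int_{-\infty}^{\infty} |\hat{u}(f)|^2\, df
  &= \int_{-\infty}^{\infty} \frac{df}{a^2 + (2\pi f)^2}
   = \frac{1}{2\pi a}\Bigl[\arctan\frac{2\pi f}{a}\Bigr]_{-\infty}^{\infty}
   = \frac{1}{2a}, \\
\hat{u}(0) &= \frac{1}{a} = \int_0^{\infty} e^{-at}\, dt .
\end{align*}
```

Both sides of the energy equation equal $1/(2a)$, and the value of the transform at $f = 0$ equals the integral of the waveform, as (4.43) requires.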


4.5.1 Measure and integration over $\mathbb{R}$


A set $A \subseteq \mathbb{R}$ is defined to be measurable if $A \cap [-T/2, T/2]$ is measurable for all $T > 0$. The
definitions of measurability and measure in Section 4.3.2 were given in terms of an overall interval
$[-T/2, T/2]$, but Exercise 4.14 verifies that those definitions are in fact independent of $T$. That
is, if $D \subseteq [-T/2, T/2]$ is measurable relative to $[-T/2, T/2]$, then $D$ is measurable relative to
$[-T_1/2, T_1/2]$ for each $T_1 > T$, and $\mu(D)$ is the same relative to each of those intervals. Thus
measure is defined unambiguously for all sets of bounded duration.


For an arbitrary measurable set $A \subseteq \mathbb{R}$, the measure of $A$ is defined to be

$$\mu(A) = \lim_{T\to\infty} \mu(A \cap [-T/2, T/2]). \tag{4.51}$$

Since $A \cap [-T/2, T/2]$ is increasing in $T$, the subset inequality says that $\mu(A \cap [-T/2, T/2])$ is
also increasing, so the limit in (4.51) must exist as either a finite or infinite value. For example,
if $A$ is taken to be $\mathbb{R}$ itself, then $\mu(\mathbb{R} \cap [-T/2, T/2]) = T$ and $\mu(\mathbb{R}) = \infty$. The possibility
for measurable sets to have infinite measure is the primary difference between measure over
$[-T/2, T/2]$ and $\mathbb{R}$.²⁸

Theorem 4.3.1 carries over without change to sets defined over R. Thus the collection of measur- 
able sets over R is closed under countable unions and intersections. The measure of a measurable 
set might be infinite in this case, and if a set has finite measure, then its complement (over R) 
must have infinite measure. 


A real function $\{u(t) : \mathbb{R} \to \mathbb{R}\}$ is measurable if the set $\{t : u(t) \le \beta\}$ is measurable for each
$\beta \in \mathbb{R}$. Equivalently, $\{u(t) : \mathbb{R} \to \mathbb{R}\}$ is measurable if and only if $u(t)\operatorname{rect}(t/T)$ is measurable for
all $T > 0$. A complex function $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ is measurable if the real and imaginary parts of
$u(t)$ are measurable.


If $\{u(t) : \mathbb{R} \to \mathbb{R}\}$ is measurable and nonnegative, there are two approaches to its Lebesgue
integral. The first is to use (4.14) directly and the other is to first evaluate the integral over
$[-T/2, T/2]$ and then go to the limit $T \to \infty$. Both approaches give the same result.²⁹


For measurable real functions $\{u(t) : \mathbb{R} \to \mathbb{R}\}$ that take on both positive and negative values,
the same approach as in the finite duration case is successful. That is, let $u^+(t)$ and $u^-(t)$ be the
positive and negative parts of $u(t)$ respectively. If at most one of these has an infinite integral,
the integral of $u(t)$ is defined and has the value

$$\int u(t)\, dt = \int u^+(t)\, dt - \int u^-(t)\, dt.$$


Finally, a complex function $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ is defined to be measurable if the real and imaginary
parts of $u(t)$ are measurable. If the integral of $\Re(u(t))$ and that of $\Im(u(t))$ are defined, then

$$\int u(t)\, dt = \int \Re(u(t))\, dt + i \int \Im(u(t))\, dt. \tag{4.52}$$


A function $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ is said to be in the class $\mathcal{L}_1$ if $u(t)$ is measurable and the Lebesgue
integral of $|u(t)|$ is finite. As with integration over a finite interval, an $\mathcal{L}_1$ function has real and
imaginary parts both of which are $\mathcal{L}_1$. Also the positive and negative parts of those real and
imaginary parts have finite integrals.


Example 4.5.1. The sinc function, $\operatorname{sinc}(t) = \sin(\pi t)/(\pi t)$, is sketched below and provides an
interesting example of these definitions. Since $\operatorname{sinc}(t)$ approaches 0 with increasing $t$ only as $1/t$,
the Riemann integral of $|\operatorname{sinc}(t)|$ is infinite, and with a little thought it can be seen that the
Lebesgue integral is also infinite. Thus $\operatorname{sinc}(t)$ is not an $\mathcal{L}_1$ function. In a similar way, $\operatorname{sinc}^+(t)$
and $\operatorname{sinc}^-(t)$ have infinite integrals and thus the Lebesgue integral of $\operatorname{sinc}(t)$ over $(-\infty, \infty)$ is
undefined.


The Riemann integral in this case is said to be improper, but can still be calculated by integrating
from $-A$ to $+A$ and then taking the limit $A \to \infty$. The result of this integration is 1, which
is most easily found through the Fourier relationship (4.47) combined with (4.43). Thus, in
a sense, the sinc function is an example where the Riemann integral exists but the Lebesgue
integral does not. In a deeper sense, however, the issue is simply one of definitions; one can
always use Lebesgue integration over $[-A, A]$ and go to the limit $A \to \infty$, getting the same
answer as the Riemann integral provides.

Figure 4.8: The function $\operatorname{sinc}(t)$ goes to 0 as $1/t$ with increasing $t$.

²⁸In fact, it was the restriction to finite measure that permitted the simple definition of measurability in terms
of sets and their complements in Subsection 4.3.2.

²⁹As explained shortly in the sinc function example, this is not necessarily true for functions taking on positive
and negative values.
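The contrast between the two integrals in this example can be seen numerically. The sketch below assumes NumPy, whose `np.sinc` uses the same normalized definition $\sin(\pi t)/(\pi t)$ as the text, and approximates the integrals over $[-A, A]$ by a midpoint rule with an illustrative step size.

```python
import numpy as np

def integrals(A, dt=1e-3):
    """Midpoint-rule integrals of sinc(t) and |sinc(t)| over [-A, A]."""
    n = int(round(2 * A / dt))
    t = -A + (np.arange(n) + 0.5) * dt
    s = np.sinc(t)                  # NumPy's sinc is sin(pi t)/(pi t), as in the text
    return np.sum(s) * dt, np.sum(np.abs(s)) * dt

for A in (10, 100, 1000):
    signed, absolute = integrals(A)
    print(f"A = {A:5d}   integral of sinc = {signed:.5f}   integral of |sinc| = {absolute:.3f}")
```

The signed integral settles near 1 as $A$ grows, while the integral of the magnitude keeps growing, roughly in proportion to $\ln A$, which is consistent with sinc not being $\mathcal{L}_1$.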


A function $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ is said to be in the class $\mathcal{L}_2$ if $u(t)$ is measurable and the Lebesgue
integral of $|u(t)|^2$ is finite. All source and channel waveforms will be assumed to be $\mathcal{L}_2$. As
pointed out earlier, any $\mathcal{L}_2$ function of finite duration is also $\mathcal{L}_1$. $\mathcal{L}_2$ functions of infinite duration,
however, need not be $\mathcal{L}_1$; the sinc function is a good example. Since $\operatorname{sinc}(t)$ decays as $1/t$, it is
not $\mathcal{L}_1$. However, $|\operatorname{sinc}(t)|^2$ decays as $1/t^2$ as $t \to \infty$, so the integral is finite and $\operatorname{sinc}(t)$ is an
$\mathcal{L}_2$ function.


In summary, measure and integration over $\mathbb{R}$ can be treated in essentially the same way as over
$[-T/2, T/2]$. The point sets and functions of interest can be truncated to $[-T/2, T/2]$ with a
subsequent passage to the limit $T \to \infty$. As will be seen, however, this requires some care with
functions that are not $\mathcal{L}_1$.


4.5.2 Fourier transforms of $\mathcal{L}_2$ functions


The Fourier transform does not exist for all functions, and when the Fourier transform does exist,
there is not necessarily an inverse Fourier transform. This section first discusses $\mathcal{L}_1$ functions
and then $\mathcal{L}_2$ functions. A major result is that $\mathcal{L}_1$ functions always have well-defined Fourier
transforms, but the inverse transform does not always have very nice properties. $\mathcal{L}_2$ functions
also always have Fourier transforms, but only in the sense of $\mathcal{L}_2$ equivalence. Here however, the
inverse transform also exists in the sense of $\mathcal{L}_2$ equivalence. We are primarily interested in $\mathcal{L}_2$
functions, but the results about $\mathcal{L}_1$ functions will help in understanding the $\mathcal{L}_2$ transform.

Lemma 4.5.1. Let $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ be $\mathcal{L}_1$. Then $\hat{u}(f) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt$ both exists and
satisfies $|\hat{u}(f)| \le \int |u(t)|\, dt$ for each $f \in \mathbb{R}$. Furthermore, $\{\hat{u}(f) : \mathbb{R} \to \mathbb{C}\}$ is a continuous
function of $f$.


Proof: Note that $|u(t)e^{-2\pi i f t}| = |u(t)|$ for all real $t$ and $f$. Thus $u(t)e^{-2\pi i f t}$ is $\mathcal{L}_1$ for each $f$
and the integral exists and satisfies the given bound. This is the same as the argument about
Fourier series coefficients in Theorem 4.4.1. The continuity follows from a simple $\epsilon/\delta$ argument
(see Exercise 4.24).


As an example, the function $u(t) = \operatorname{rect}(t)$ is $\mathcal{L}_1$ and its Fourier transform, defined at each $f$, is
the continuous function $\operatorname{sinc}(f)$. As discussed before, $\operatorname{sinc}(f)$ is not $\mathcal{L}_1$. The inverse transform
of $\operatorname{sinc}(f)$ exists at all $t$, equaling $\operatorname{rect}(t)$ except at $t = \pm 1/2$, where it has the value 1/2. Lemma
4.5.1 also applies to inverse transforms and verifies that $\operatorname{sinc}(f)$ cannot be $\mathcal{L}_1$, since its inverse
transform is discontinuous.
Next consider $\mathcal{L}_2$ functions. It will be seen that the pointwise Fourier transform $\int u(t)\, e^{-2\pi i f t}\, dt$
does not necessarily exist at each $f$, but that it does exist as an $\mathcal{L}_2$ limit. In exchange for this
added complexity, however, the inverse transform exists in exactly the same sense. This result is
called Plancherel's theorem and has a nice interpretation in terms of approximations over finite
time and frequency intervals.


For any $\mathcal{L}_2$ function $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ and any positive number $A$, define $\hat{u}_A(f)$ as the Fourier
transform of the truncation of $u(t)$ to $[-A, A]$; i.e.,

$$\hat{u}_A(f) = \int_{-A}^{A} u(t)\, e^{-2\pi i f t}\, dt. \tag{4.53}$$

The function $u(t)\operatorname{rect}(\frac{t}{2A})$ has finite duration and is thus $\mathcal{L}_1$. It follows that $\hat{u}_A(f)$ is continuous
and exists for all $f$ by the above lemma. One would normally expect to take the limit in (4.53)
as $A \to \infty$ to get the Fourier transform $\hat{u}(f)$, but this limit does not necessarily exist for each
$f$. Plancherel's theorem, however, asserts that this limit exists in the $\mathcal{L}_2$ sense. This theorem is
proved in Section 5A.1.


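Theorem 4.5.1 (Plancherel, part 1). For any $\mathcal{L}_2$ function $\{u(t) : \mathbb{R} \to \mathbb{C}\}$, an $\mathcal{L}_2$ function
$\{\hat{u}(f) : \mathbb{R} \to \mathbb{C}\}$ exists satisfying both

$$\lim_{A\to\infty} \int_{-\infty}^{\infty} |\hat{u}(f) - \hat{u}_A(f)|^2\, df = 0 \tag{4.54}$$

and the energy equation, (4.45).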


This not only guarantees the existence of a Fourier transform (up to $\mathcal{L}_2$ equivalence), but also
guarantees that it is arbitrarily closely approximated (in difference energy) by the continuous
Fourier transforms of the truncated versions of $u(t)$. Intuitively what is happening here is that
$\mathcal{L}_2$ functions must have an arbitrarily large fraction of their energy within sufficiently large
truncated limits; the part of the function outside of these limits cannot significantly affect the
$\mathcal{L}_2$ convergence of the Fourier transform.


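The inverse transform is treated very similarly. For any $\mathcal{L}_2$ function $\{\hat{u}(f) : \mathbb{R} \to \mathbb{C}\}$ and any
$B$, $0 < B < \infty$, define

$$u_B(t) = \int_{-B}^{B} \hat{u}(f)\, e^{2\pi i f t}\, df. \tag{4.55}$$

As before, $u_B(t)$ is a continuous $\mathcal{L}_2$ function for all $B$, $0 < B < \infty$. The final part of Plancherel's
theorem is then: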


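Theorem 4.5.2 (Plancherel, part 2). For any $\mathcal{L}_2$ function $\{u(t) : \mathbb{R} \to \mathbb{C}\}$, let $\{\hat{u}(f) : \mathbb{R} \to \mathbb{C}\}$
be the Fourier transform of Theorem 4.5.1 and let $u_B(t)$ satisfy (4.55). Then

$$\lim_{B\to\infty} \int_{-\infty}^{\infty} |u(t) - u_B(t)|^2\, dt = 0. \tag{4.56}$$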


The interpretation is similar to the first part of the theorem. Specifically the inverse transforms
of finite frequency truncations of the transform are continuous and converge to an $\mathcal{L}_2$ limit as
$B \to \infty$. It also says that this $\mathcal{L}_2$ limit is equivalent to the original function $u(t)$.
Using the limit in mean-square notation, both parts of the Plancherel theorem can be expressed
by stating that every $\mathcal{L}_2$ function $u(t)$ has a Fourier transform $\hat{u}(f)$ satisfying

$$\hat{u}(f) = \operatorname*{l.i.m.}_{A\to\infty} \int_{-A}^{A} u(t)\, e^{-2\pi i f t}\, dt; \qquad u(t) = \operatorname*{l.i.m.}_{B\to\infty} \int_{-B}^{B} \hat{u}(f)\, e^{2\pi i f t}\, df,$$

i.e., the inverse Fourier transform of $\hat{u}(f)$ is $\mathcal{L}_2$ equivalent to $u(t)$. The first integral above
converges pointwise if $u(t)$ is also $\mathcal{L}_1$, and in this case converges pointwise to a continuous
function $\hat{u}(f)$. If $u(t)$ is not $\mathcal{L}_1$, then the first integral need not converge pointwise. The second
integral behaves in the analogous way.


It may help in understanding the Plancherel theorem to interpret it in terms of finding Fourier
transforms using Riemann integration. Riemann integration over an infinite region is defined as
a limit over finite regions. Thus the Riemann version of the Fourier transform is shorthand for

$$\hat{u}(f) = \lim_{A\to\infty} \int_{-A}^{A} u(t)\, e^{-2\pi i f t}\, dt = \lim_{A\to\infty} \hat{u}_A(f). \tag{4.57}$$

Thus the Plancherel theorem can be viewed as replacing the Riemann integral with a Lebesgue
integral and replacing the pointwise limit (if it exists) in (4.57) with $\mathcal{L}_2$ convergence. The Fourier
transform over the finite limits $-A$ to $A$ is continuous and well-behaved, so the major difference
comes in using $\mathcal{L}_2$ convergence as $A \to \infty$.


As an example of the Plancherel theorem, let $u(t) = \operatorname{rect}(t)$. Then $\hat{u}_A(f) = \operatorname{sinc}(f)$ for all
$A \ge 1/2$, so $\hat{u}(f) = \operatorname{sinc}(f)$. For the inverse transform, $u_B(t) = \int_{-B}^{B} \operatorname{sinc}(f)\, e^{2\pi i f t}\, df$ is messy to
compute but can be seen to approach $\operatorname{rect}(t)$ as $B \to \infty$ except at $t = \pm 1/2$, where it equals
1/2. At $t = \pm 1/2$, the inverse transform is 1/2, whereas $u(t) = 1$.
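Although $u_B(t)$ is messy to compute in closed form, it is easy to evaluate numerically. The following is a minimal sketch, assuming NumPy and a midpoint rule with an illustrative step `df`; it evaluates the frequency-truncated inverse transform of $\operatorname{sinc}(f)$ at a few times for one large value of $B$.

```python
import numpy as np

def u_B(t, B, df=1e-3):
    """Midpoint-rule version of (4.55) with u_hat(f) = sinc(f)."""
    n = int(round(2 * B / df))
    f = -B + (np.arange(n) + 0.5) * df
    # The imaginary part cancels by symmetry in f, so only the real part is kept.
    return np.sum(np.sinc(f) * np.exp(2j * np.pi * f * t)).real * df

B = 200.0
values = {t: u_B(t, B) for t in (0.0, 0.25, 0.5, 1.0)}
for t, val in values.items():
    print(f"t = {t:4.2f}   u_B(t) = {val:.4f}")
```

Inside the pulse the values are close to 1, outside they are close to 0, and at the discontinuity $t = 1/2$ the value is close to 1/2, in line with the discussion above.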


As another example, consider the function $u(t)$ where $u(t) = 1$ for rational values of $t \in [0, 1]$ and
$u(t) = 0$ otherwise. Since this is 0 a.e., the Fourier transform $\hat{u}(f)$ is 0 for all $f$ and the inverse
transform is 0, which is $\mathcal{L}_2$ equivalent to $u(t)$. Finally, Example 5A.1 in Section 5A.1 illustrates
a bizarre $\mathcal{L}_1$ function $g(t)$ that is everywhere discontinuous. Its transform $\hat{g}(f)$ is bounded
and continuous by Lemma 4.5.1, but is not $\mathcal{L}_1$. The inverse transform is again discontinuous
everywhere in $(0, 1)$ and unbounded over every subinterval. This example makes clear why the
inverse transform of a continuous function of frequency might be bizarre, thus reinforcing our
focus on $\mathcal{L}_2$ functions rather than a more conventional focus on notions such as continuity.


In what follows, $\mathcal{L}_2$ convergence, as in the Plancherel theorem, will be seen as increasingly
friendly and natural. Regarding two functions whose difference has 0 energy as being the same
(formally, as $\mathcal{L}_2$ equivalent) allows us to avoid many trivialities, such as how to define a
discontinuous function at its discontinuities. In this case, engineering common-sense and sophisticated
mathematics arrive at the same conclusion.


Finally, it can be shown that all the Fourier transform relations in (4.33) to (4.41) except
differentiation hold for all $\mathcal{L}_2$ functions (see Exercises 4.26 and 5.15). The derivative of an $\mathcal{L}_2$
function need not be $\mathcal{L}_2$, and need not have a well-defined Fourier transform.


4.6 The DTFT and the sampling theorem 


The discrete-time Fourier transform (DTFT) is the time/frequency dual of the Fourier series. 
It will be shown that the DTFT leads immediately to the sampling theorem. 
4.6.1 The discrete-time Fourier transform


Let $\hat{u}(f)$ be an $\mathcal{L}_2$ function of frequency, nonzero only for $-W \le f \le W$. The DTFT of $\hat{u}(f)$
over $[-W, W]$ is then defined by

$$\hat{u}(f) = \operatorname*{l.i.m.} \sum_k u_k\, e^{-2\pi i k f/(2W)}\, \operatorname{rect}\Bigl(\frac{f}{2W}\Bigr), \tag{4.58}$$

where the DTFT coefficients $\{u_k;\ k \in \mathbb{Z}\}$ are given by

$$u_k = \frac{1}{2W}\int_{-W}^{W} \hat{u}(f)\, e^{2\pi i k f/(2W)}\, df. \tag{4.59}$$

These are the same as the Fourier series equations, replacing $t$ by $f$, $T$ by $2W$, and $e^{2\pi i \cdots}$ by
$e^{-2\pi i \cdots}$. Note that $\hat{u}(f)$ has an inverse Fourier transform $u(t)$ which is thus baseband-limited to
$[-W, W]$. As will be shown shortly, the sampling theorem relates the samples of this baseband
waveform to the coefficients in (4.59).

The Fourier series theorem (Theorem 4.4.1) clearly applies to (4.58)-(4.59) with the above no- 
tational changes; it is repeated here for convenience. 


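Theorem 4.6.1 (DTFT). Let $\{\hat{u}(f) : [-W, W] \to \mathbb{C}\}$ be an $\mathcal{L}_2$ function. Then for each $k \in \mathbb{Z}$,
the Lebesgue integral (4.59) exists and satisfies $|u_k| \le \frac{1}{2W}\int |\hat{u}(f)|\, df < \infty$. Furthermore,

$$\lim_{\ell\to\infty} \int_{-W}^{W} \Bigl| \hat{u}(f) - \sum_{k=-\ell}^{\ell} u_k\, e^{-2\pi i k f/(2W)} \Bigr|^2 df = 0, \quad\text{and} \tag{4.60}$$

$$\int_{-W}^{W} |\hat{u}(f)|^2\, df = 2W \sum_{k=-\infty}^{\infty} |u_k|^2. \tag{4.61}$$

Finally, if $\{u_k,\ k \in \mathbb{Z}\}$ is a sequence of complex numbers satisfying $\sum_k |u_k|^2 < \infty$, then an $\mathcal{L}_2$
function $\{\hat{u}(f) : [-W, W] \to \mathbb{C}\}$ exists satisfying (4.60) and (4.61).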


As before, (4.58) is shorthand for (4.60). Again, this says that any desired approximation 
accuracy, in terms of energy, can be achieved by using enough terms in the series. 


Both the Fourier series and the DTFT provide a one-to-one transformation (in the sense of $\mathcal{L}_2$
convergence) between a function and a sequence of complex numbers. In the case of the Fourier
series, one usually starts with a function $u(t)$ and uses the sequence of coefficients to represent
the function (up to $\mathcal{L}_2$ equivalence). In the case of the DTFT, one often starts with the sequence
and uses the frequency function to represent the sequence. Since the transformation goes both
ways, however, one can view the function and the sequence as equally fundamental.
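The two directions of the DTFT can be illustrated with a short sequence. The sketch below assumes NumPy and a hypothetical finite sequence (only three nonzero $u_k$); it builds the frequency function from the sequence as in (4.58), recovers the coefficients with a midpoint-sum version of (4.59), and compares the two sides of the energy relation (4.61).

```python
import numpy as np

W = 4.0
u_seq = {0: 1.0, 1: 0.5, 2: -0.25}     # hypothetical finite sequence; other u_k are 0

N = 1024                                # midpoint samples on [-W, W]
f = -W + (np.arange(N) + 0.5) * 2 * W / N

# (4.58) for a finite sequence: u_hat(f) = sum_k u_k e^{-2 pi i k f / (2W)} on [-W, W].
u_hat = sum(uk * np.exp(-2j * np.pi * k * f / (2 * W)) for k, uk in u_seq.items())

def dtft_coeff(k):
    """Numerical version of (4.59): (1/2W) * integral of u_hat(f) e^{2 pi i k f / (2W)}."""
    return np.mean(u_hat * np.exp(2j * np.pi * k * f / (2 * W)))

recovered = {k: dtft_coeff(k) for k in range(-1, 4)}
energy_freq = np.mean(np.abs(u_hat) ** 2) * 2 * W                  # left side of (4.61)
energy_seq = 2 * W * sum(abs(uk) ** 2 for uk in u_seq.values())    # right side of (4.61)
print(recovered)
print(energy_freq, energy_seq)
```

Starting from the sequence and recovering it from the frequency function shows the sense in which either one can be taken as the primary object.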


4.6.2 The sampling theorem 


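The DTFT is next used to establish the sampling theorem, which in turn will help interpret the
DTFT. The DTFT (4.58) expresses $\hat{u}(f)$ as a weighted sum of truncated sinusoids in frequency,

$$\hat{u}(f) = \operatorname*{l.i.m.} \sum_k u_k\, \hat{\phi}_k(f), \quad\text{where}\quad \hat{\phi}_k(f) = e^{-2\pi i k f/(2W)}\, \operatorname{rect}\Bigl(\frac{f}{2W}\Bigr). \tag{4.62}$$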




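Ignoring any questions of convergence for the time being, the inverse Fourier transform of $\hat{u}(f)$
is then given by $u(t) = \sum_k u_k \phi_k(t)$, where $\phi_k(t)$ is the inverse transform of $\hat{\phi}_k(f)$. Since the
inverse transform³⁰ of $\operatorname{rect}(\frac{f}{2W})$ is $2W \operatorname{sinc}(2Wt)$, the time-shift relation implies that the inverse
transform of $\hat{\phi}_k(f)$ is

$$\phi_k(t) = 2W \operatorname{sinc}(2Wt - k) \;\leftrightarrow\; \hat{\phi}_k(f) = e^{-2\pi i k f/(2W)}\, \operatorname{rect}\Bigl(\frac{f}{2W}\Bigr). \tag{4.63}$$

Thus $u(t)$, the inverse transform of $\hat{u}(f)$, is given by

$$u(t) = \sum_{k=-\infty}^{\infty} u_k\, \phi_k(t) = \sum_{k=-\infty}^{\infty} 2W u_k \operatorname{sinc}(2Wt - k). \tag{4.64}$$

Since the set of truncated sinusoids $\{\hat{\phi}_k;\ k \in \mathbb{Z}\}$ are orthogonal, the sinc functions $\{\phi_k;\ k \in \mathbb{Z}\}$
are also orthogonal from (4.46).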


Figure 4.9: Sketch of $\operatorname{sinc}(t) = \frac{\sin(\pi t)}{\pi t}$ and $\operatorname{sinc}(t - 1)$. Note that these spaced sinc
functions are orthogonal to each other.


Note that $\operatorname{sinc}(t)$ equals 1 for $t = 0$ and 0 for all other integer $t$. Thus if (4.64) is evaluated for
$t = \frac{k}{2W}$, the result is that $u(\frac{k}{2W}) = 2W u_k$ for all integer $k$. Substituting this into (4.64) results
in the equation known as the sampling equation,

$$u(t) = \sum_{k=-\infty}^{\infty} u\Bigl(\frac{k}{2W}\Bigr) \operatorname{sinc}(2Wt - k).$$

This says that a baseband-limited function is specified by its samples at intervals $T = 1/(2W)$.
In terms of this sample interval, the sampling equation is

$$u(t) = \sum_{k=-\infty}^{\infty} u(kT)\, \operatorname{sinc}\Bigl(\frac{t}{T} - k\Bigr). \tag{4.65}$$


The following theorem makes this precise. See Section 5A.2 for an insightful proof. 


Theorem 4.6.2 (Sampling theorem). Let $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ be a continuous $\mathcal{L}_2$ function 
baseband-limited to $W$. Then (4.65) specifies $u(t)$ in terms of its $T$-spaced samples with 
$T = \frac{1}{2W}$. The sum in (4.65) converges to $u(t)$ for each $t \in \mathbb{R}$ and $u(t)$ is bounded at each $t$ 
by $|u(t)| \le \int_{-W}^{W} |\hat{u}(f)|\, df < \infty$. 
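

The sampling equation (4.65) can be tried numerically. The following is only a sketch under stated assumptions: the test waveform $u(t) = \mathrm{sinc}^2(t)$ is a choice made here (its transform is a triangle on $[-1, 1]$, so $W = 1$ and $T = 1/2$), and the infinite sum is truncated to $|k| \le K$, so agreement is approximate rather than exact.

```python
import math


def sinc(x: float) -> float:
    """sinc(x) = sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def u(t: float) -> float:
    """Assumed test waveform sinc^2(t): continuous, L2, baseband-limited to W = 1."""
    return sinc(t) ** 2


def reconstruct(t: float, W: float = 1.0, K: int = 2000) -> float:
    """Truncated sampling equation (4.65): sum over |k| <= K of u(kT) sinc(t/T - k)."""
    T = 1.0 / (2.0 * W)
    return sum(u(k * T) * sinc(t / T - k) for k in range(-K, K + 1))


if __name__ == "__main__":
    for t in (0.0, 0.3, 1.7, -2.45):
        print(f"t={t:+.2f}  u(t)={u(t):.8f}  reconstructed={reconstruct(t):.8f}")
```

For this waveform the terms of the sum decay like $1/k^3$, so the truncation error shrinks roughly as $1/K^2$. The bound in the theorem is also easy to see here: the triangle has unit area, and $|u(t)| \le 1$ for all $t$.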


The following example illustrates why u(t) is assumed to be continuous above. 


$^{30}$This is the time/frequency dual of (4.47). $\hat{u}(f) = \mathrm{rect}(\frac{f}{2W})$ is both $\mathcal{L}_1$ and $\mathcal{L}_2$; $u(t)$ is continuous and $\mathcal{L}_2$ but 
not $\mathcal{L}_1$. From the Plancherel theorem, the transform of $u(t)$, in the $\mathcal{L}_2$ sense, is $\hat{u}(f)$. 


Example 4.6.1 (A discontinuous baseband function). Let $u(t)$ be a continuous $\mathcal{L}_2$ baseband 
function limited to $|f| \le 1/2$. Let $v(t)$ satisfy $v(t) = u(t)$ for all noninteger $t$ and 
$v(t) = u(t) + 1$ for all integer $t$. Then $u(t)$ and $v(t)$ are $\mathcal{L}_2$ equivalent, but their samples at 
each integer time differ by 1. Their Fourier transforms are the same, say $\hat{u}(f)$, since the 
differences at isolated points have no effect on the transform. Since $\hat{u}(f)$ is nonzero only in $[-W, W]$, 
it is $\mathcal{L}_1$. According to the time/frequency dual of Lemma 4.5.1, the pointwise inverse Fourier 
transform of $\hat{u}(f)$ is a continuous function, say $u(t)$. Out of all the $\mathcal{L}_2$ equivalent waveforms that 
have the transform $\hat{u}(f)$, only $u(t)$ is continuous, and it is that $u(t)$ that satisfies the sampling 
theorem. 


The function v(t) is equal to u(t) except for the isolated discontinuities at each integer point. 
One could view v(t) as baseband-limited also, but v(t) is clearly not physically meaningful and 
is not the continuous function of the theorem. 


The above example illustrates an ambiguity about the meaning of baseband-limited functions. 
One reasonable definition is that an $\mathcal{L}_2$ function $u(t)$ is baseband-limited to $W$ if $\hat{u}(f)$ is 0 
for $|f| > W$. Another reasonable definition is that $u(t)$ is baseband-limited to $W$ if $u(t)$ is 
the pointwise inverse Fourier transform of a function $\hat{u}(f)$ that is 0 for $|f| > W$. For a given 
$\hat{u}(f)$, there is a unique $u(t)$ according to the second definition and it is continuous; all the 
functions that are $\mathcal{L}_2$ equivalent to $u(t)$ are bandlimited by the first definition, and all but $u(t)$ 
are discontinuous and potentially violate the sampling equation. Clearly the second definition 
is preferable on both engineering and mathematical grounds. 


Definition: An $\mathcal{L}_2$ function is baseband-limited to $W$ if it is the pointwise inverse transform 
of an $\mathcal{L}_2$ function $\hat{u}(f)$ that is 0 for $|f| > W$. Equivalently, it is baseband-limited to $W$ if it is 
continuous and its Fourier transform is 0 for $|f| > W$. 


The DTFT can now be further interpreted. Any baseband-limited $\mathcal{L}_2$ function $\{\hat{u}(f) : 
[-W, W] \to \mathbb{C}\}$ has both an inverse Fourier transform $u(t) = \int \hat{u}(f) e^{2\pi i f t}\, df$ and a DTFT 
sequence given by (4.58). The coefficients $u_k$ of the DTFT are the scaled samples, $T u(kT)$, of 
$u(t)$, where $T = \frac{1}{2W}$. Put in a slightly different way, the DTFT in (4.58) is the Fourier transform 
of the sampling equation (4.65) with $u(kT) = u_k/T$.$^{31}$ 

It is somewhat surprising that the sampling theorem holds with pointwise convergence, whereas 
its transform, the DTFT, holds only in the $\mathcal{L}_2$ equivalence sense. The reason is that the function 
$\hat{u}(f)$ in the DTFT is $\mathcal{L}_1$ but not necessarily continuous, whereas its inverse transform $u(t)$ is 
necessarily continuous but not necessarily $\mathcal{L}_1$. 

The set of functions $\{\hat{\phi}_k(f); k \in \mathbb{Z}\}$ in (4.63) is an orthogonal set, since the interval $[-W, W]$ 
contains an integer number of cycles from each sinusoid. Thus, from (4.46), the set of sinc 
functions in the sampling equation is also orthogonal. Thus both the DTFT and the sampling 
theorem expansion are orthogonal expansions. It follows (as will be shown carefully later) that 
the energy equation, 


$$\int_{-\infty}^{\infty} |u(t)|^2\, dt = T \sum_{k=-\infty}^{\infty} |u(kT)|^2, \tag{4.66}$$


holds for any continuous $\mathcal{L}_2$ function $u(t)$ baseband-limited to $[-W, W]$ with $T = \frac{1}{2W}$. 
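

As a rough numerical check of the energy equation (4.66), the sketch below again assumes the test waveform $u(t) = \mathrm{sinc}^2(t)$ with $W = 1$ and $T = 1/2$ (a choice made here for illustration). The right side is a truncated sum of squared samples. The left side is evaluated in the frequency domain, as the energy of the triangle transform $1 - |f|$ on $[-1, 1]$, which assumes the usual energy equation for Fourier transforms.

```python
import math


def sinc(x: float) -> float:
    """sinc(x) = sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def sample_energy(W: float = 1.0, K: int = 2000) -> float:
    """Right side of (4.66) for u(t) = sinc^2(t): T * sum of |u(kT)|^2 over |k| <= K."""
    T = 1.0 / (2.0 * W)
    return T * sum(sinc(k * T) ** 4 for k in range(-K, K + 1))


def spectrum_energy(n: int = 20000) -> float:
    """Energy of the triangle transform (1 - |f|) on [-1, 1], by the midpoint rule."""
    h = 2.0 / n
    return h * sum((1.0 - abs(-1.0 + (j + 0.5) * h)) ** 2 for j in range(n))


if __name__ == "__main__":
    print("T * sum |u(kT)|^2 =", sample_energy())
    print("energy of u_hat   =", spectrum_energy())
```

Both quantities come out close to $2/3$, the exact energy of the triangle.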


$^{31}$Note that the DTFT is the time/frequency dual of the Fourier series but is the Fourier transform of the 
sampling equation. 


In terms of source coding, the sampling theorem says that any $\mathcal{L}_2$ function $u(t)$ that is 
baseband-limited to $W$ can be sampled at rate $2W$ (i.e., at intervals $T = \frac{1}{2W}$) and the samples can later 
be used to perfectly reconstruct the function. This is slightly different from the channel coding 
situation where a sequence of signal values is mapped into a function from which the signals 
can later be reconstructed. The sampling theorem shows that any $\mathcal{L}_2$ baseband-limited function 
can be represented by its samples. The following theorem, proved in Section 5A.2, covers the 
channel coding variation: 


Theorem 4.6.3 (Sampling theorem for transmission). Let $\{a_k; k \in \mathbb{Z}\}$ be an arbitrary 
sequence of complex numbers satisfying $\sum_k |a_k|^2 < \infty$. Then $\sum_k a_k\, \mathrm{sinc}(2Wt - k)$ converges 
pointwise to a continuous bounded $\mathcal{L}_2$ function $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ that is baseband-limited to $W$ 
and satisfies $a_k = u(\frac{k}{2W})$ for each $k$. 
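

The transmission form of the theorem can be illustrated with a finite sequence, for which $\sum_k |a_k|^2 < \infty$ holds trivially. This is a minimal sketch: the helper name `modulate`, the symbol values and the choice $W = 4$ are assumptions made for the illustration, not part of the text.

```python
import math


def sinc(x: float) -> float:
    """sinc(x) = sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def modulate(a, W):
    """Return the waveform u(t) = sum_k a[k] sinc(2Wt - k), with k = 0..len(a)-1."""
    def u(t):
        return sum(ak * sinc(2.0 * W * t - k) for k, ak in enumerate(a))
    return u


# Hypothetical complex signal values (illustration only).
symbols = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j, 1 + 1j]
W = 4.0
u = modulate(symbols, W)

# Sampling u(t) at t = k/(2W) returns the original sequence, a_k = u(k/(2W)).
recovered = [u(k / (2.0 * W)) for k in range(len(symbols))]

if __name__ == "__main__":
    for a, r in zip(symbols, recovered):
        print(a, "->", r)
```

Each sample picks out a single term because $\mathrm{sinc}(j - k)$ is 1 for $j = k$ and 0 for every other integer $j$ (up to floating-point rounding in the code).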


4.6.3 Source coding using sampled waveforms 


The introduction and Figure 4.1 discuss the sampling of an analog waveform $u(t)$ and quantizing 
the samples as the first two steps in analog source coding. Section 4.2 discusses an alternative in 
which successive segments $\{u_m(t)\}$ of the source are each expanded in a Fourier series, and then 
the Fourier series coefficients are quantized. In this latter case, the received segments $\{v_m(t)\}$ 
are reconstructed from the quantized coefficients. The energy in $u_m(t) - v_m(t)$ is given in (4.7) 
as a scaled version of the sum of the squared coefficient differences. This section treats the 
analogous relationship when quantizing the samples of a baseband-limited waveform. 


For a continuous function $u(t)$, baseband-limited to $W$, the samples $\{u(kT); k \in \mathbb{Z}\}$ at intervals 
$T = 1/(2W)$ specify the function. If $u(kT)$ is quantized to $v(kT)$ for each $k$, and $u(t)$ is 
reconstructed as $v(t) = \sum_k v(kT)\, \mathrm{sinc}(\frac{t}{T} - k)$, then, from (4.66), the mean-squared error is 
given by 


$$\int_{-\infty}^{\infty} |u(t) - v(t)|^2\, dt = T \sum_{k=-\infty}^{\infty} |u(kT) - v(kT)|^2. \tag{4.67}$$


Thus whatever quantization scheme is used to minimize the mean-squared error between a 
sequence of samples, that same strategy serves to minimize the mean-squared error between the 
corresponding waveforms. 
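

The relation (4.67) can be checked numerically for a short sample sequence. The sketch below makes several assumptions for illustration: the sample values, the uniform scalar quantizer with step 0.25, $T = 1$, and a midpoint-rule integration over the finite window $[-200, 200]$, which slightly underestimates the infinite integral. Only the difference waveform $u(t) - v(t) = \sum_k (u(kT) - v(kT))\, \mathrm{sinc}(\frac{t}{T} - k)$ is needed.

```python
import math


def sinc(x: float) -> float:
    """sinc(x) = sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


T = 1.0
samples = [0.10, 0.37, -0.52, 0.81, -0.23, 0.66]        # hypothetical u(kT), k = 0..5
step = 0.25                                             # hypothetical quantizer step
quantized = [step * round(x / step) for x in samples]   # v(kT)
errors = [x - q for x, q in zip(samples, quantized)]    # u(kT) - v(kT)

# Right side of (4.67): T times the sum of squared sample errors.
predicted = T * sum(e * e for e in errors)


def error_waveform(t: float) -> float:
    """u(t) - v(t), built from the sample errors by the sampling equation."""
    return sum(e * sinc(t / T - k) for k, e in enumerate(errors))


def integrate_energy(L: float = 200.0, h: float = 0.02) -> float:
    """Midpoint-rule approximation of the integral of |u(t) - v(t)|^2 over [-L, L]."""
    n = int(round(2.0 * L / h))
    return h * sum(error_waveform(-L + (j + 0.5) * h) ** 2 for j in range(n))


numeric = integrate_energy()

if __name__ == "__main__":
    print("T * sum of squared sample errors:", predicted)
    print("numerically integrated energy:   ", numeric)
```

The two numbers agree to within the truncation error of the finite integration window, which is consistent with the orthogonality of the $T$-spaced sinc functions.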


The results in Chapter 3 regarding mean-squared distortion for uniform vector quantizers give 
the distortion at any given bit rate per sample as a linear function of the mean-squared value of 
the source samples. If any sample has an infinite mean-squared value, then either the 
quantization rate is infinite or the mean-squared distortion is infinite. This same result then carries over 
to waveforms. This starts to show why the restriction to $\mathcal{L}_2$ source waveforms is important. It 
also starts to show why general results about $\mathcal{L}_2$ waveforms are important. 


The sampling theorem tells the story for sampling baseband-limited waveforms. However, 
physical source waveforms are not perfectly limited to some frequency $W$; rather, their spectra usually 
drop off rapidly above some nominal frequency $W$. For example, audio spectra start dropping 
off well before the nominal cutoff frequency of 4 kHz, but often have small amounts of energy 
up to 20 kHz. Then the samples at rate $2W$ do not quite specify the waveform, which leads to 
an additional source of error, called aliasing. Aliasing will be discussed more fully in the next 
two subsections. 


There is another unfortunate issue with the sampling theorem. The sinc function is nonzero 
over all noninteger times. Recreating the waveform at the receiver$^{32}$ from a set of samples 
thus requires infinite delay. Practically, of course, sinc functions can be truncated, but the 
sinc waveform decays to zero as $1/t$, which is impractically slow. Thus the clean result of the 
sampling theorem is not quite as practical as it first appears. 


4.6.4 The sampling theorem for $[\Delta - W, \Delta + W]$ 


Just as the Fourier series generalizes to time intervals centered at some arbitrary time $\Delta$, the 
DTFT generalizes to frequency intervals centered at some arbitrary frequency $\Delta$. 


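Consider an $\mathcal{L}_2$ frequency function $\{\hat{v}(f) : [\Delta - W, \Delta + W] \to \mathbb{C}\}$. The shifted DTFT for $\hat{v}(f)$ is 
then 


$$\hat{v}(f) = \operatorname{l.i.m.} \sum_k v_k\, e^{-2\pi i k f/(2W)}\, \mathrm{rect}\!\left(\frac{f - \Delta}{2W}\right), \quad \text{where} \tag{4.68}$$

$$v_k = \frac{1}{2W} \int_{\Delta - W}^{\Delta + W} \hat{v}(f)\, e^{2\pi i k f/(2W)}\, df. \tag{4.69}$$


Equation (4.68) is an orthogonal expansion, 


$$\hat{v}(f) = \operatorname{l.i.m.} \sum_k v_k \hat{\theta}_k(f), \quad \text{where } \hat{\theta}_k(f) = e^{-2\pi i k f/(2W)}\, \mathrm{rect}\!\left(\frac{f - \Delta}{2W}\right).$$


The inverse Fourier transform of $\hat{\theta}_k(f)$ can be calculated by shifting and scaling to be 


$$\theta_k(t) = 2W\,\mathrm{sinc}(2Wt - k)\, e^{2\pi i \Delta (t - \frac{k}{2W})} \;\leftrightarrow\; \hat{\theta}_k(f) = e^{-2\pi i k f/(2W)}\, \mathrm{rect}\!\left(\frac{f - \Delta}{2W}\right). \tag{4.70}$$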


Let $v(t)$ be the inverse Fourier transform of $\hat{v}(f)$. 


$$v(t) = \sum_k v_k \theta_k(t) = \sum_k 2W v_k\, \mathrm{sinc}(2Wt - k)\, e^{2\pi i \Delta (t - \frac{k}{2W})}.$$


For $t = \frac{k}{2W}$, only the $k$th term above is nonzero, and $v(\frac{k}{2W}) = 2W v_k$. This generalizes the 
sampling equation to the frequency band $[\Delta - W, \Delta + W]$, 


$$v(t) = \sum_k v\!\left(\frac{k}{2W}\right) \mathrm{sinc}(2Wt - k)\, e^{2\pi i \Delta (t - \frac{k}{2W})}.$$


Defining the sampling interval $T = 1/(2W)$ as before, this becomes 


$$v(t) = \sum_k v(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right) e^{2\pi i \Delta (t - kT)}. \tag{4.71}$$


Theorems 4.6.2 and 4.6.3 apply to this more general case. That is, with $v(t) = 
\int_{\Delta - W}^{\Delta + W} \hat{v}(f) e^{2\pi i f t}\, df$, the function $v(t)$ is bounded and continuous and the series in (4.71) 
converges for all $t$. Similarly, if $\sum_k |v(kT)|^2 < \infty$, there is a unique continuous $\mathcal{L}_2$ function 
$v(t)$, frequency-limited to $[\Delta - W, \Delta + W]$ with $W = 1/(2T)$, that has those sample values. 
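

The shifted sampling equation (4.71) can be tried on a simple passband waveform. In this sketch the waveform $v(t) = \mathrm{sinc}^2(t)\, e^{2\pi i \Delta t}$, the center frequency $\Delta = 10$, the half-width $W = 1$, and the truncation of the sum to $|k| \le K$ are all assumptions chosen for illustration.

```python
import cmath
import math


def sinc(x: float) -> float:
    """sinc(x) = sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


DELTA = 10.0          # hypothetical center frequency
W = 1.0               # half-width of the band [DELTA - W, DELTA + W]
T = 1.0 / (2.0 * W)   # sampling interval


def v(t: float) -> complex:
    """sinc^2(t) shifted up in frequency by DELTA; its transform lies in the band."""
    return sinc(t) ** 2 * cmath.exp(2j * math.pi * DELTA * t)


def reconstruct(t: float, K: int = 2000) -> complex:
    """Truncated version of (4.71), using only the T-spaced samples v(kT)."""
    return sum(
        v(k * T) * sinc(t / T - k) * cmath.exp(2j * math.pi * DELTA * (t - k * T))
        for k in range(-K, K + 1)
    )


if __name__ == "__main__":
    for t in (0.3, -1.25, 2.6):
        print(t, v(t), reconstruct(t))
```

Note that the sampling rate is set by the width $2W$ of the band, not by its highest frequency $\Delta + W$.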


$^{32}$Recall that the receiver time reference is delayed from that at the source by some constant $\tau$. Thus $v(t)$, 
the receiver estimate of the source waveform $u(t)$ at source time $t$, is recreated at source time $t + \tau$. With the 
sampling equation, even if the sinc function is approximated, $\tau$ is impractically large. 


4.7 Aliasing and the sinc-weighted sinusoid expansion 


In this section an orthogonal expansion for arbitrary $\mathcal{L}_2$ functions called the $T$-spaced 
sinc-weighted sinusoid expansion is developed. This expansion is very similar to the $T$-spaced 
truncated sinusoid expansion discussed earlier, except that its set of orthogonal waveforms consists of 
time and frequency shifts of a sinc function rather than a rectangular function. This expansion 
is then used to discuss the important concept of degrees of freedom. Finally this same expansion 
is used to develop the concept of aliasing. This will help in understanding sampling for functions 
that are only approximately frequency-limited. 


4.7.1 The T-spaced sinc-weighted sinusoid expansion 


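Let $u(t) \leftrightarrow \hat{u}(f)$ be an arbitrary $\mathcal{L}_2$ transform pair, and segment $\hat{u}(f)$ into intervals$^{33}$ of width 
$2W$. Thus 


$$\hat{u}(f) = \operatorname{l.i.m.} \sum_m \hat{v}_m(f), \quad \text{where } \hat{v}_m(f) = \hat{u}(f)\, \mathrm{rect}\!\left(\frac{f}{2W} - m\right).$$


Note that $\hat{v}_0(f)$ is non-zero only in $[-W, W]$ and thus corresponds to an $\mathcal{L}_2$ function $v_0(t)$ 
baseband-limited to $W$. More generally, for arbitrary integer $m$, $\hat{v}_m(f)$ is non-zero only in 
$[\Delta - W, \Delta + W]$ for $\Delta = 2Wm$. From (4.71), the inverse transform with $T = \frac{1}{2W}$ satisfies 


$$v_m(t) = \sum_k v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right) e^{2\pi i (\frac{m}{T})(t - kT)} = \sum_k v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right) e^{2\pi i m t/T}. \tag{4.72}$$


Combining all of these frequency segments, 


$$u(t) = \operatorname{l.i.m.} \sum_m v_m(t) = \operatorname{l.i.m.} \sum_{m,k} v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right) e^{2\pi i m t/T}. \tag{4.73}$$


This converges in $\mathcal{L}_2$, but does not necessarily converge pointwise because of the infinite 
summation over $m$. It expresses an arbitrary $\mathcal{L}_2$ function $u(t)$ in terms of the samples of each 
frequency slice, $v_m(t)$, of $u(t)$. 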


This is an orthogonal expansion in the doubly indexed set of functions 


$$\left\{\psi_{m,k}(t) = \mathrm{sinc}\!\left(\frac{t}{T} - k\right) e^{2\pi i m t/T};\; m, k \in \mathbb{Z}\right\}. \tag{4.74}$$


These are the time and frequency shifts of the basic function $\psi_{0,0}(t) = \mathrm{sinc}(\frac{t}{T})$. The time shifts 
are in multiples of $T$ and the frequency shifts are in multiples of $1/T$. This set of orthogonal 
functions is called the set of $T$-spaced sinc-weighted sinusoids. 

The $T$-spaced sinc-weighted sinusoids and the $T$-spaced truncated sinusoids are quite similar. 
Each function in the first set is a time and frequency translate of $\mathrm{sinc}(\frac{t}{T})$. Each function in 
the second set is a time and frequency translate of $\mathrm{rect}(\frac{t}{T})$. Both sets are made up of functions 
separated by multiples of $T$ in time and $1/T$ in frequency. 


$^{33}$The boundary points between frequency segments can be ignored, as in the case for time segments. 


4.7.2 Degrees of freedom 


An important rule of thumb used by communication engineers is that the class of real functions 
that are approximately baseband-limited to $W_0$ and approximately time-limited to $[-T_0/2, T_0/2]$ 
have about $2T_0W_0$ real degrees of freedom if $T_0W_0 \gg 1$. This means that any function within that class can be specified approximately by 
specifying about $2T_0W_0$ real numbers as coefficients in an orthogonal expansion. The same rule 
is valid for complex functions in terms of complex degrees of freedom. 


This somewhat vague statement is difficult to state precisely, since time-limited functions cannot 
be frequency-limited and vice-versa. However, the concept is too important to ignore simply 
because of lack of precision. Thus several examples are given. 


First, consider applying the sampling theorem to real (complex) functions $u(t)$ that are strictly 
baseband-limited to $W_0$. Then $u(t)$ is specified by its real (complex) samples at rate $2W_0$. If the 
samples are nonzero only within the interval $[-T_0/2, T_0/2]$, then there are about $2T_0W_0$ nonzero 
samples, and these specify $u(t)$ within this class. Here a precise class of functions has been 
specified, but functions that are zero outside of an interval have been replaced with functions 
whose samples are zero outside of the interval. 


Second, consider complex functions $u(t)$ that are again strictly baseband-limited to $W_0$, but now 
apply the sinc-weighted sinusoid expansion with $W = W_0/(2n + 1)$ for some positive integer $n$. 
That is, the band $[-W_0, W_0]$ is split into $2n + 1$ slices and each slice is expanded in a 
sampling-theorem expansion. Each slice is specified by samples at rate $2W$, so all slices are specified 
collectively by samples at an aggregate rate $2W_0$ as before. If the samples are nonzero only 
within $[-T_0/2, T_0/2]$, then there are about$^{34}$ $2T_0W_0$ nonzero complex samples that specify any 
$u(t)$ in this class. 
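

The sample count in this argument can be tabulated directly. The sketch below is an illustration under assumptions made here: the function name `count_samples` is hypothetical, the sample times are taken to be $kT$ for integer $k$ with $T = (2n+1)/(2W_0)$, and a sample is counted when $|kT| \le T_0/2$. Exact rational arithmetic avoids rounding at the interval endpoints.

```python
from fractions import Fraction
from math import floor


def count_samples(T0, W0, n):
    """Count the samples at times kT inside [-T0/2, T0/2], summed over all 2n+1 slices.

    The band [-W0, W0] is split into 2n+1 slices of half-width W = W0/(2n+1),
    each sampled at intervals T = 1/(2W) = (2n+1)/(2 W0).
    """
    slices = 2 * n + 1
    T = Fraction(slices) / (2 * Fraction(W0))
    k_max = floor(Fraction(T0) / (2 * T))   # largest k with k*T <= T0/2
    return slices * (2 * k_max + 1)


if __name__ == "__main__":
    T0, W0 = 100, 50
    print("2*T0*W0 =", 2 * T0 * W0)
    for n in (0, 2, 10):
        print("n =", n, " samples =", count_samples(T0, W0, n))
```

With $T_0 = 100$ and $W_0 = 50$, the counts stay within a fraction of a percent of $2T_0W_0 = 10000$ for each of these $n$; the small excess comes from the samples near the ends of the interval.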


If the functions in this class are further constrained to be real, then the coefficients for the 
central frequency slice are real and the negative slices are specified by the positive slices. Thus 
each real function in this class is specified by about $2T_0W_0$ real numbers. 


This class of functions is slightly different for each choice of $n$, since the detailed interpretation 
of what “approximately time-limited” means is changing. From a more practical perspective, 
however, all of these expansions express an approximately baseband-limited waveform by samples 
at rate $2W_0$. As the overall duration $T_0$ of the class of waveforms increases, the initial transient 
due to the samples centered close to $-T_0/2$ and the final transient due to samples centered close 
to $T_0/2$ should become unimportant relative to the rest of the waveform. 


The same conclusion can be reached for functions that are strictly time-limited to $[-T_0/2, T_0/2]$ 
by using the truncated sinusoid expansion with coefficients outside of $[-W_0, W_0]$ set to 0. 


In summary, all the above expansions require roughly $2W_0T_0$ numbers for the approximate 
specification of a waveform essentially limited to time $T_0$ and frequency $W_0$ for $T_0W_0$ large. 


It is possible to be more precise about the number of degrees of freedom in a given time and 
frequency band by looking at the prolate spheroidal waveform expansion (see the Appendix, 
Section 5A.3). The orthogonal waveforms in this expansion maximize the energy in the given 
time/frequency region in a certain sense. It is perhaps simpler and better, however, to live with 
the very approximate nature of the arguments based on the sinc-weighted sinusoid expansion 
and the truncated sinusoid expansion. 


$^{34}$Calculating this number of samples carefully yields $(2n + 1)\left(1 + 2\left\lfloor \frac{T_0W_0}{2n+1} \right\rfloor\right)$. 
4.7.3 Aliasing — a time domain approach 


Both the truncated sinusoid and the sinc-weighted sinusoid expansions are conceptually use- 
ful for understanding waveforms that are approximately time- and bandwidth-limited, but in 
practice, waveforms are usually sampled, perhaps at a rate much higher than twice the nominal 
bandwidth, before digitally processing the waveforms. Thus it is important to understand the 
error involved in such sampling. 


Suppose an $\mathcal{L}_2$ function $u(t)$ is sampled with $T$-spaced samples, $\{u(kT); k \in \mathbb{Z}\}$. Let $s(t)$ denote 
the approximation to $u(t)$ that results from the sampling theorem expansion, 


$$s(t) = \sum_k u(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right). \tag{4.75}$$


If $u(t)$ is baseband-limited to $W = 1/(2T)$, then $s(t) = u(t)$, but here it is no longer assumed 
that $u(t)$ is baseband limited. The expansion of $u(t)$ into individual frequency slices, repeated 
below from (4.73), helps in understanding the difference between $u(t)$ and $s(t)$: 


$$u(t) = \operatorname{l.i.m.} \sum_{m,k} v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right) e^{2\pi i m t/T}, \quad \text{where} \tag{4.76}$$

$$v_m(t) = \int \hat{u}(f)\, \mathrm{rect}(fT - m)\, e^{2\pi i f t}\, df. \tag{4.77}$$


For an arbitrary $\mathcal{L}_2$ function $u(t)$, the sample points $u(kT)$ might be at points of 
discontinuity and thus be questionable. Also (4.75) need not converge, and (4.76) might not converge 
pointwise. To avoid these problems, $\hat{u}(f)$ will later be restricted beyond simply being $\mathcal{L}_2$. First, 
however, questions of convergence are disregarded and the relevant equations are derived without 
questioning when they are correct. 


From (4.75), the samples of $s(t)$ are given by $s(kT) = u(kT)$, and combining with (4.76), 


$$s(kT) = u(kT) = \sum_m v_m(kT). \tag{4.78}$$


Thus the samples from different frequency slices get summed together in the samples of u(t). 
This phenomenon is called aliasing. There is no way to tell, from the samples {u(kT);k € Z} 
alone, how much contribution comes from each frequency slice and thus, as far as the samples 
are concerned, every frequency band is an ‘alias’ for every other. 
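

The following sketch illustrates this ambiguity with the sinc-weighted sinusoids of (4.74). The particular values ($T = 1/2$, time index $k = 2$, frequency slices $m = 0$ and $m = 3$) are arbitrary choices made here. The two waveforms occupy different frequency bands, yet their $T$-spaced samples coincide.

```python
import cmath
import math


def sinc(x: float) -> float:
    """sinc(x) = sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def psi(m: int, k: int, t: float, T: float) -> complex:
    """Sinc-weighted sinusoid of (4.74): sinc(t/T - k) * exp(2 pi i m t / T)."""
    return sinc(t / T - k) * cmath.exp(2j * math.pi * m * t / T)


T = 0.5
sample_times = [j * T for j in range(-5, 6)]

baseband = [psi(0, 2, t, T) for t in sample_times]   # slice m = 0
shifted = [psi(3, 2, t, T) for t in sample_times]    # slice m = 3

# Between sample times the two waveforms are quite different.
midway_gap = abs(psi(0, 2, 1.25, T) - psi(3, 2, 1.25, T))

if __name__ == "__main__":
    for a, b in zip(baseband, shifted):
        print(f"{a.real:+.3f}   {b.real:+.3f}{b.imag:+.3f}j")
    print("difference at t = 1.25:", midway_gap)
```

At every sample time $jT$ the factor $e^{2\pi i m j}$ equals 1, which is why the index $m$ of the frequency slice leaves no trace in the samples.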


Although $u(t)$ and $s(t)$ agree at the sample times, they differ elsewhere (assuming that $u(t)$ is 
not strictly baseband-limited to $1/(2T)$). Combining (4.78) and (4.75), 


$$s(t) = \sum_k \sum_m v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right). \tag{4.79}$$


The expressions in (4.79) and (4.76) agree at $m = 0$, so the difference between $u(t)$ and $s(t)$ is 


$$u(t) - s(t) = \sum_k \sum_{m \ne 0} -v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right) + \sum_k \sum_{m \ne 0} v_m(kT)\, e^{2\pi i m t/T}\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right).$$


The first term above is $v_0(t) - s(t)$, i.e., the difference in the nominal baseband $[-W, W]$. This is 
the error caused by the aliased terms in $s(t)$. The second term is the nonbaseband 
portion of $u(t)$, which is orthogonal to the first error term. Since each term is an orthogonal 
expansion in the sinc-weighted sinusoids of (4.74), the energy in the error is given by$^{35}$ 


$$\int_{-\infty}^{\infty} |u(t) - s(t)|^2\, dt = T \sum_k \Bigl|\sum_{m \ne 0} v_m(kT)\Bigr|^2 + T \sum_k \sum_{m \ne 0} |v_m(kT)|^2. \tag{4.80}$$


Later, when the source waveform $u(t)$ is viewed as a sample function of a random process $U(t)$, 
it will be seen that under reasonable conditions the expected values of these two error terms are 
approximately equal. Thus, if $u(t)$ is filtered by an ideal low-pass filter before sampling, then 
$s(t)$ becomes equal to $v_0(t)$ and only the second error term in (4.80) remains; this reduces the 
expected mean-squared error roughly by a factor of 2. It is often easier, however, to simply 
sample a little faster. 


4.7.4 Aliasing — a frequency domain approach 


Aliasing can be, and usually is, analyzed from a frequency domain standpoint. From (4.79), $s(t)$ 
can be separated into the contribution from each frequency band as 


$$s(t) = \sum_m s_m(t), \quad \text{where } s_m(t) = \sum_k v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right). \tag{4.81}$$


Comparing $s_m(t)$ to $v_m(t) = \sum_k v_m(kT)\, \mathrm{sinc}(\frac{t}{T} - k)\, e^{2\pi i m t/T}$, it is seen that 


$$v_m(t) = s_m(t)\, e^{2\pi i m t/T}.$$


From the Fourier frequency shift relation, $\hat{v}_m(f) = \hat{s}_m(f - \frac{m}{T})$, so 


$$\hat{s}_m(f) = \hat{v}_m\!\left(f + \frac{m}{T}\right). \tag{4.82}$$


Finally, since $\hat{v}_m(f) = \hat{u}(f)\, \mathrm{rect}(fT - m)$, one sees that $\hat{v}_m(f + \frac{m}{T}) = \hat{u}(f + \frac{m}{T})\, \mathrm{rect}(fT)$. Thus, 
summing (4.82) over $m$, 


$$\hat{s}(f) = \sum_m \hat{u}\!\left(f + \frac{m}{T}\right) \mathrm{rect}(fT). \tag{4.83}$$


Each frequency slice $\hat{v}_m(f)$ is shifted down to baseband in this equation, and then all these 
shifted frequency slices are summed together, as illustrated in Figure 4.10. This establishes the 
essence of the following aliasing theorem, which is proved in Section 5A.2. 


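Theorem 4.7.1 (Aliasing theorem). Let $\hat{u}(f)$ be $\mathcal{L}_2$, and let $\hat{u}(f)$ satisfy the condition 
$\lim_{|f| \to \infty} \hat{u}(f)|f|^{1+\varepsilon} = 0$ for some $\varepsilon > 0$. Then $\hat{u}(f)$ is $\mathcal{L}_1$, and the inverse Fourier transform 
$u(t) = \int \hat{u}(f) e^{2\pi i f t}\, df$ converges pointwise to a continuous bounded function. For any given 
$T > 0$, the sampling approximation $\sum_k u(kT)\, \mathrm{sinc}(\frac{t}{T} - k)$ converges pointwise to a continuous 
bounded $\mathcal{L}_2$ function $s(t)$. The Fourier transform of $s(t)$ satisfies 


$$\hat{s}(f) = \operatorname{l.i.m.} \sum_m \hat{u}\!\left(f + \frac{m}{T}\right) \mathrm{rect}(fT). \tag{4.84}$$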


$^{35}$As shown by example in Exercise 4.38, $s(t)$ need not be $\mathcal{L}_2$ unless the additional restrictions of Theorem 4.7.1 
are applied to $\hat{u}(f)$. In these bizarre situations, the first sum in (4.80) is infinite and $s(t)$ is a complete failure as 
an approximation to $u(t)$. 


Figure 4.10: The transform $\hat{s}(f)$ of the baseband-sampled approximation $s(t)$ to $u(t)$ is 
constructed by folding the transform $\hat{u}(f)$ into $[-1/(2T), 1/(2T)]$. For example, using real 
functions for pictorial clarity, the component a is mapped into a', b into b' and c into c'. 
These folded components are added to obtain $\hat{s}(f)$. If $\hat{u}(f)$ is complex, then both the real 
and imaginary parts of $\hat{u}(f)$ must be folded in this way to get the real and imaginary parts 
respectively of $\hat{s}(f)$. The figure further clarifies the two terms on the right of (4.80). The first 
term is the energy of $\hat{u}(f) - \hat{s}(f)$ caused by the folded components in part (ii). The final term 
is the energy in part (i) outside of $[-1/(2T), 1/(2T)]$. 


The condition that $\lim \hat{u}(f) f^{1+\varepsilon} = 0$ implies that $\hat{u}(f)$ goes to 0 with increasing $f$ at a faster 
rate than $1/f$. Exercise 4.37 gives an example in which the theorem fails in the absence of this 
condition. 


Without the mathematical convergence details, what the aliasing theorem says is that, 
corresponding to a Fourier transform pair $u(t) \leftrightarrow \hat{u}(f)$, there is another Fourier transform pair $s(t)$ 
and $\hat{s}(f)$; $s(t)$ is a baseband sampling expansion using the $T$-spaced samples of $u(t)$ and $\hat{s}(f)$ is 
the result of folding the transform $\hat{u}(f)$ into the band $[-W, W]$ with $W = 1/(2T)$. 
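

A small worked case may help make the folding concrete. The sketch below assumes $u(t) = \mathrm{sinc}^2(t)$, whose transform is the triangle $1 - |f|$ on $[-1, 1]$, sampled too slowly with $T = 1$ (so $W = 1/2$); these choices, the helper names, and the midpoint-rule integration are assumptions made for illustration. Folding the triangle into $[-1/2, 1/2]$ gives a flat spectrum, matching the fact that the samples $u(k)$ are 1 at $k = 0$ and 0 elsewhere, so that $s(t) = \mathrm{sinc}(t)$.

```python
import math


def sinc(x: float) -> float:
    """sinc(x) = sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def u_hat(f: float) -> float:
    """Triangle transform of u(t) = sinc^2(t): 1 - |f| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(f))


def s_hat(f: float, T: float = 1.0, M: int = 3) -> float:
    """Folded transform: sum over |m| <= M of u_hat(f + m/T), for |f| <= 1/(2T)."""
    if abs(f) > 0.5 / T:
        return 0.0
    return sum(u_hat(f + m / T) for m in range(-M, M + 1))


def s(t: float, T: float = 1.0, K: int = 50) -> float:
    """Sampling approximation: sum over |k| <= K of u(kT) sinc(t/T - k)."""
    return sum(sinc(k * T) ** 2 * sinc(t / T - k) for k in range(-K, K + 1))


def midpoint(g, a: float, b: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (j + 0.5) * h) for j in range(n))


# The two error energies, evaluated in the frequency domain.
folded_error = midpoint(lambda f: (u_hat(f) - s_hat(f)) ** 2, -0.5, 0.5)
out_of_band = (midpoint(lambda f: u_hat(f) ** 2, -1.0, -0.5)
               + midpoint(lambda f: u_hat(f) ** 2, 0.5, 1.0))

if __name__ == "__main__":
    print("s_hat(0.25) =", s_hat(0.25))
    print("s(0.3) =", s(0.3), " sinc(0.3) =", sinc(0.3))
    print("folded-component error energy:", folded_error)
    print("out-of-band energy:           ", out_of_band)
```

In this particular example both error energies come out as $1/12$. That exact equality is a property of the chosen triangle, not a general statement.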


4.8 Summary 


The theory of $\mathcal{L}_2$ (finite-energy) functions has been developed in this chapter. These are in many 
ways the ideal waveforms to study, both because of the simplicity and generality of their 
mathematical properties and because of their appropriateness for modeling both source waveforms 
and channel waveforms. 


For encoding source waveforms, the general approach is 
• expand the waveform into an orthogonal expansion 
• quantize the coefficients in that expansion 
• use discrete source coding on the quantizer output. 


The distortion, measured as the energy in the difference between the source waveform and 
the reconstructed waveform, is proportional to the squared quantization error in the quantized 
coefficients. 


For encoding waveforms to be transmitted over communication channels, the approach is 
• map the incoming sequence of binary digits into a sequence of real or complex symbols 
• use the symbols as coefficients in an orthogonal expansion. 


Orthogonal expansions have been discussed in this chapter and will be further discussed in 
Chapter 5. Chapter 6 will discuss the choice of symbol set, the mapping from binary digits, and 
the choice of orthogonal expansion. 


This chapter showed that every $\mathcal{L}_2$ time-limited waveform has a Fourier series, where each 
Fourier coefficient is given as a Lebesgue integral and the Fourier series converges in $\mathcal{L}_2$, i.e., 
as more and more Fourier terms are used in approximating the function, the energy difference 
between the waveform and the approximation gets smaller and approaches 0 in the limit. 


Also, by the Plancherel theorem, every $\mathcal{L}_2$ waveform $u(t)$ (time-limited or not) has a Fourier 
integral $\hat{u}(f)$. For each truncated approximation, $u_A(t) = u(t)\, \mathrm{rect}(\frac{t}{2A})$, the Fourier integral 
$\hat{u}_A(f)$ exists with pointwise convergence and is continuous. The Fourier integral $\hat{u}(f)$ is then 
the $\mathcal{L}_2$ limit of these approximation waveforms. The inverse transform exists in the same way. 


These powerful $\mathcal{L}_2$ convergence results for Fourier series and integrals are not needed for 
computing the Fourier transforms and series for the conventional waveforms appearing in exercises. 
They become important both when the waveforms are sample functions of random processes 
and when one wants to find limits on possible performance. In both of these situations, one 
is dealing with a large class of potential waveforms, rather than a single waveform, and these 
general results become important. 


The DTFT is the frequency/time dual of the Fourier series, and the sampling theorem is simply 
the Fourier transform of the DTFT, combined with a little care about convergence. 


The $T$-spaced truncated sinusoid expansion and the $T$-spaced sinc-weighted sinusoid expansion 
are two orthogonal expansions of an arbitrary $\mathcal{L}_2$ waveform. The first is formed by segmenting 
the waveform into $T$-length segments and expanding each segment in a Fourier series. The 
second is formed by segmenting the waveform in frequency and sampling each frequency band. 
The orthogonal waveforms in each are the time/frequency translates of $\mathrm{rect}(t/T)$ for the first 
case and $\mathrm{sinc}(t/T)$ for the second. Each expansion leads to the notion that waveforms roughly 
limited to a time interval $T_0$ and a baseband frequency interval $W_0$ have approximately $2T_0W_0$ 
degrees of freedom when $T_0W_0$ is large. 


Aliasing is the ambiguity in a waveform that is represented by its $T$-spaced samples. If an 
$\mathcal{L}_2$ waveform is baseband-limited to $1/(2T)$, then its samples specify the waveform, but if the 
waveform has components in other bands, these components are aliased with the baseband 
components in the samples. The aliasing theorem says that the Fourier transform of the 
baseband reconstruction from the samples is equal to the original Fourier transform folded into that 
baseband. 


4A Appendix: Supplementary material and proofs 


The first part of the appendix is an introduction to countable sets. These results are used 
throughout the chapter, and the material here can serve either as a first exposure or a review. 
The following three parts of the appendix provide added insight and proofs about the results on 
measurable sets. 


4A.1 Countable sets 


A collection of distinguishable objects is countably infinite if the objects can be put into 
one-to-one correspondence with the positive integers. Stated more intuitively, the collection is countably 
infinite if the set of elements can be arranged as a sequence $a_1, a_2, \ldots$. A set is countable if it 
contains either a finite or countably infinite set of elements. 


Example 4A.1 (The set of all integers). The integers can be arranged as the sequence 0, 
-1, +1, -2, +2, -3, ... , and thus the set is countably infinite. Note that each integer appears 
once and only once in this sequence, and the one-to-one correspondence is $(0 \leftrightarrow 1)$, $(-1 \leftrightarrow 
2)$, $(+1 \leftrightarrow 3)$, $(-2 \leftrightarrow 4)$, etc. There are many other ways to list the integers as a sequence, such 
as 0, -1, +1, +2, -2, +3, +4, -3, +5, ..., but, for example, listing all the non-negative integers 
first followed by all the negative integers is not a valid one-to-one correspondence since there 
are no positive integers left over for the negative integers to map into. 


Example 4A.2 (The set of 2-tuples of positive integers). Figure 4.11 shows that this set 
is countably infinite by showing one way to list the elements in a sequence. Note that every 
2-tuple is eventually reached in this list. In a weird sense, this means that there are as many 
positive integers as there are pairs of positive integers, but what is happening is that the integers 
in the 2-tuple advance much more slowly than the position in the list. For example, it can be 
verified that (n,n) appears in position 2n(n — 1) + 1 of the list. 


Figure 4.11: A one-to-one correspondence between positive integers and 2-tuples of 
positive integers. 
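

One possible listing of this kind can be generated as follows. The ordering used in this sketch, one anti-diagonal ($i + j$ constant) at a time, is an assumption: the figure's exact path may traverse each diagonal in the other direction. The position of $(n, n)$ is unaffected by that choice, since $(n, n)$ sits in the middle of its diagonal.

```python
from itertools import count, islice


def pairs():
    """Yield every 2-tuple of positive integers, one anti-diagonal (i + j = s) at a time."""
    for s in count(2):
        for i in range(1, s):
            yield (i, s - i)


def position(pair):
    """Return the 1-based position of `pair` in the list produced by pairs()."""
    for pos, p in enumerate(pairs(), start=1):
        if p == pair:
            return pos


if __name__ == "__main__":
    print(list(islice(pairs(), 6)))
    for n in range(1, 6):
        print(n, position((n, n)), 2 * n * (n - 1) + 1)
```

Every pair $(i, j)$ is reached after finitely many steps because the diagonal $i + j = s$ contains only $s - 1$ pairs.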


By combining the ideas in the previous two examples, it can be seen that the collection of all 
integer 2-tuples is countably infinite. With a little more ingenuity, it can be seen that the set of 
integer n-tuples is countably infinite for all positive integer n. Finally, it is straightforward to 
verify that any subset of a countable set is also countable. Also a finite union of countable sets 
is countable, and in fact a countable union of countable sets must be countable. 


Example 4A.3 (The set of rational numbers). Each rational number can be represented by an 
integer numerator and denominator, and can be uniquely represented by its irreducible 
numerator and denominator. Thus the rational numbers can be put into one-to-one correspondence 
with a subset of the collection of 2-tuples of integers, and are thus countable. The rational 
numbers in the interval $[-T/2, T/2]$ for any given $T > 0$ form a subset of all rational numbers, 
and therefore are countable also. 


As seen in Subsection 4.3.1, any countable set of numbers $a_1, a_2, \ldots$ can be expressed as a disjoint 
countable union of zero-measure sets, $[a_1, a_1], [a_2, a_2], \ldots$ so the measure of any countable set 
is zero. Consider a function that has the value 1 at each rational argument and 0 elsewhere. 
The Lebesgue integral of that function is 0. Since rational numbers exist in every positive-sized 
interval of the real line, no matter how small, the Riemann integral of this function is undefined. 
This function is not of great practical interest, but provides insight into why Lebesgue integration 
is so general. 


Example 4A.4 (The set of binary sequences). An example of an uncountable set of 
elements is the set of (unending) sequences of binary digits. It will be shown that this set contains 
uncountably many elements by assuming the contrary and constructing a contradiction. Thus, 
suppose we can list all binary sequences, $a_1, a_2, a_3, \ldots$. Each sequence, $a_n$, can be expressed 
as $a_n = (a_{n,1}, a_{n,2}, \ldots)$, resulting in a doubly infinite array of binary digits. We now construct 
a new binary sequence $b = b_1, b_2, \ldots$, in the following way. For each integer $n > 0$, choose 
$b_n \ne a_{n,n}$; since $b_n$ is binary, this specifies $b_n$ for each $n$ and thus specifies $b$. Now $b$ differs from 
each of the listed sequences in at least one binary digit, so that $b$ is a binary sequence not on 
the list. This is a contradiction, since by assumption the list contains each binary sequence. 
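

The diagonal construction can be illustrated on a finite array. The sketch below is only a finite analogue of the argument: the four listed sequences are made-up data, each truncated to four digits, and indices are 0-based in the code whereas the text counts from 1.

```python
def diagonal_complement(rows):
    """Given n binary sequences with at least n digits each, return n digits b
    such that b[n] differs from rows[n][n] for every n (0-based indices)."""
    return [1 - rows[n][n] for n in range(len(rows))]


# Hypothetical list of binary sequences, truncated to their first four digits.
listed = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]

b = diagonal_complement(listed)

if __name__ == "__main__":
    print("b =", b)
    for n, row in enumerate(listed):
        print(f"row {n}: {row}  differs from b in digit {n}: {row[n] != b[n]}")
```

Whatever the list contains, the constructed sequence disagrees with the $n$th listed sequence in its $n$th digit, which is the whole content of the argument.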


This example clearly extends to ternary sequences and sequences from any alphabet with more 
than one member. 


Example 4A.5 (The set of real numbers in $[0, 1)$). This is another uncountable set, and 
the proof is very similar to that of the last example. Any real number $r \in [0, 1)$ can be represented 
as a binary expansion $0.r_1 r_2 \cdots$ whose elements $r_k$ are chosen to satisfy $r = \sum_{k=1}^{\infty} r_k 2^{-k}$ and 
where each $r_k \in \{0, 1\}$. For example, 1/2 can be represented as 0.1, 3/8 as 0.011, etc. This 
expansion is unique except in the special cases where $r$ can be represented by a finite binary 
expansion, $r = \sum_{k=1}^{m} r_k 2^{-k}$; for example, 1/2 can also be represented as $0.0111\cdots$. By convention, 
for each such $r$ (other than $r = 0$) choose $m$ as small as possible; thus in the infinite expansion, 
$r_m = 1$ and $r_k = 0$ for all $k > m$. Each such number can be alternatively represented with 
$r_m = 0$ and $r_k = 1$ for all $k > m$. 


By convention, map each such $r$ into the expansion terminating with an infinite sequence of 
zeros. The set of binary sequences is then the union of the representations of the reals in $[0, 1)$ 
and the set of binary sequences terminating in an infinite sequence of 1's. This latter set is 
countable because it is in one-to-one correspondence with the rational numbers of the form 
$\sum_{k=1}^{m} r_k 2^{-k}$ with binary $r_k$ and finite $m$. Thus if the reals were countable, their union with this 
latter set would be countable, contrary to the known uncountability of the binary sequences. 


By scaling the interval [0,1), it can be seen that the set of real numbers in any interval of 
non-zero size is uncountably infinite. Since the set of rational numbers in such an interval is 
countable, the irrational numbers must be uncountable (otherwise the union of rational and 
irrational numbers, i.e., the reals, would be countable). 


The set of irrationals in $[-T/2, T/2]$ is the complement of the rationals and thus has measure
$T$. Each pair of distinct irrationals is separated by rational numbers. Thus the irrationals can
be represented as a union of intervals only by using an uncountable union$^{36}$ of intervals, each
containing a single element. The class of uncountable unions of intervals is not very interesting
since it includes all subsets of $\mathbb{R}$.


36 This might be a shock to one's intuition. Each partial union $\bigcup_{j=1}^{k} [a_j, a_j]$ of rationals has a complement
which is the union of $k + 1$ intervals of non-zero width; each unit increase in $k$ simply causes one interval in the
complement to split into two smaller intervals (although maintaining the measure at $T$). In the limit, however,
this becomes an uncountable set of separated points.

4A.2 Finite unions of intervals over [-T/2, T/2]


Let $\mathcal{M}_f$ be the class of finite unions of intervals, i.e., the class of sets whose elements can each
be expressed as $\mathcal{E} = \bigcup_{j=1}^{\ell} I_j$ where $\{I_1, \ldots, I_\ell\}$ are intervals and $\ell \ge 1$ is an integer. Exercise
4.5 shows that each such $\mathcal{E} \in \mathcal{M}_f$ can be uniquely expressed as a finite union of $k \le \ell$ separated
intervals, say $\mathcal{E} = \bigcup_{j=1}^{k} I'_j$. The measure of $\mathcal{E}$ was defined as $\mu(\mathcal{E}) = \sum_{j=1}^{k} \mu(I'_j)$. Exercise 4.7
shows that $\mu(\mathcal{E}) \le \sum_{j=1}^{\ell} \mu(I_j)$ for the original intervals making up $\mathcal{E}$ and shows that this holds
with equality whenever $I_1, \ldots, I_\ell$ are disjoint.$^{37}$
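
As an illustration only, the reduction of a finite union to separated intervals can be sketched in a few lines. The helper names `separated_intervals` and `measure` are hypothetical, and the sketch assumes every interval is given as a pair (a, b) with both end points included, so touching intervals are merged; since single points have zero measure this does not change the measure.

```python
def separated_intervals(intervals):
    """Merge (a, b) pairs with a <= b into a sorted list of separated intervals.

    Assumption: end points are treated as included, so intervals that
    overlap or touch are combined into one interval.
    """
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], b)
        else:
            merged.append([a, b])
    return [tuple(iv) for iv in merged]


def measure(intervals):
    """Measure of the union: sum of widths of the separated intervals."""
    return sum(b - a for a, b in separated_intervals(intervals))


original = [(0, 2), (1, 3), (5, 6)]          # arbitrary, overlapping intervals
print(separated_intervals(original))          # separated representation
print(measure(original), sum(b - a for a, b in original))
```

For the sample input the measure of the union is 4 while the sum of the individual widths is 5, an instance of the inequality above; the two agree when the original intervals are disjoint.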


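The class $\mathcal{M}_f$ is closed under the union operation, since if $\mathcal{E}_1$ and $\mathcal{E}_2$ are each finite unions of
intervals, then $\mathcal{E}_1 \cup \mathcal{E}_2$ is the union of both sets of intervals. It also follows from this that if $\mathcal{E}_1$
and $\mathcal{E}_2$ are disjoint then

$$\mu(\mathcal{E}_1 \cup \mathcal{E}_2) = \mu(\mathcal{E}_1) + \mu(\mathcal{E}_2). \tag{4.85}$$

The class $\mathcal{M}_f$ is also closed under the intersection operation, since, if $\mathcal{E}_1 = \bigcup_j I_{1,j}$ and $\mathcal{E}_2 =
\bigcup_\ell I_{2,\ell}$, then $\mathcal{E}_1 \cap \mathcal{E}_2 = \bigcup_{j,\ell} (I_{1,j} \cap I_{2,\ell})$. Finally, $\mathcal{M}_f$ is closed under complementation. In fact, as
illustrated in Figure 4.5, the complement $\overline{\mathcal{E}}$ of a finite separated union of intervals $\mathcal{E}$ is simply
the union of separated intervals lying between the intervals of $\mathcal{E}$. Since $\mathcal{E}$ and its complement $\overline{\mathcal{E}}$
are disjoint and fill all of $[-T/2, T/2]$, each $\mathcal{E} \in \mathcal{M}_f$ satisfies the complement property,

$$T = \mu(\mathcal{E}) + \mu(\overline{\mathcal{E}}). \tag{4.86}$$

An important generalization of (4.85) is the following: for any $\mathcal{E}_1, \mathcal{E}_2 \in \mathcal{M}_f$,

$$\mu(\mathcal{E}_1 \cup \mathcal{E}_2) + \mu(\mathcal{E}_1 \cap \mathcal{E}_2) = \mu(\mathcal{E}_1) + \mu(\mathcal{E}_2). \tag{4.87}$$

To see this intuitively, note that each interval in $\mathcal{E}_1 \cap \mathcal{E}_2$ is counted twice on each side of (4.87),
whereas each interval in only $\mathcal{E}_1$ or only $\mathcal{E}_2$ is counted once on each side. More formally, $\mathcal{E}_1 \cup \mathcal{E}_2 =
\mathcal{E}_1 \cup (\mathcal{E}_2 \cap \overline{\mathcal{E}_1})$. Since this is a disjoint union, (4.85) shows that $\mu(\mathcal{E}_1 \cup \mathcal{E}_2) = \mu(\mathcal{E}_1) + \mu(\mathcal{E}_2 \cap \overline{\mathcal{E}_1})$.
Similarly, $\mu(\mathcal{E}_2) = \mu(\mathcal{E}_2 \cap \mathcal{E}_1) + \mu(\mathcal{E}_2 \cap \overline{\mathcal{E}_1})$. Combining these equations results in (4.87).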
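
Written out, the combining step is a single substitution. The display below is a worked restatement of the argument just given, using only (4.85) and the two disjoint decompositions; the overline denotes the complement in $[-T/2, T/2]$.

```latex
\begin{align*}
\mu(\mathcal{E}_1 \cup \mathcal{E}_2)
  &= \mu(\mathcal{E}_1) + \mu(\mathcal{E}_2 \cap \overline{\mathcal{E}_1})
     && \text{disjoint union, (4.85)} \\
  &= \mu(\mathcal{E}_1) + \bigl[\mu(\mathcal{E}_2) - \mu(\mathcal{E}_2 \cap \mathcal{E}_1)\bigr]
     && \text{since } \mu(\mathcal{E}_2) = \mu(\mathcal{E}_2 \cap \mathcal{E}_1)
        + \mu(\mathcal{E}_2 \cap \overline{\mathcal{E}_1}).
\end{align*}
```

Moving $\mu(\mathcal{E}_2 \cap \mathcal{E}_1)$ to the left side gives the joint union and intersection property (4.87).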


4A.3 Countable unions and outer measure over [-T/2, T/2]


Let $\mathcal{M}_c$ be the class of countable unions of intervals, i.e., each set $\mathcal{B} \in \mathcal{M}_c$ can be expressed as
$\mathcal{B} = \bigcup_j I_j$ where $\{I_1, I_2, \ldots\}$ is either a finite or countably infinite collection of intervals. The
class $\mathcal{M}_c$ is closed under both the union operation and the intersection operation by the same
argument as used for $\mathcal{M}_f$. $\mathcal{M}_c$ is also closed under countable unions (see Exercise 4.8) but not
closed under complements or countable intersections.$^{38}$


Each $\mathcal{B} \in \mathcal{M}_c$ can be uniquely$^{39}$ expressed as a countable union of separated intervals, say
$\mathcal{B} = \bigcup_j I'_j$ where $\{I'_1, I'_2, \ldots\}$ are separated (see Exercise 4.6). The measure of $\mathcal{B}$ is defined as

$$\mu(\mathcal{B}) = \sum_j \mu(I'_j). \tag{4.88}$$


37 Recall that intervals such as (0,1], (1,2] are disjoint but not separated. A set $\mathcal{E} \in \mathcal{M}_f$ has many representations
as disjoint intervals but only one as separated intervals, which is why the definition refers to separated intervals.

38 Appendix 4A.1 shows that the complement of the rationals, i.e., the set of irrationals, does not belong to
$\mathcal{M}_c$. The irrationals can also be viewed as the intersection of the complements of the rationals, giving an example
where $\mathcal{M}_c$ is not closed under countable intersections.

39 What is unique here is the collection of intervals, not the particular ordering; this does not affect the infinite
sum in (4.88) (see Exercise 4.4).

As shown in Subsection 4.3.1, the right side of (4.88) always converges to a number between
0 and $T$. For $\mathcal{B} = \bigcup_j I_j$ where $I_1, I_2, \ldots$ are arbitrary intervals, Exercise 4.7 establishes the
following union bound,

$$\mu(\mathcal{B}) \le \sum_j \mu(I_j) \quad \text{with equality if } I_1, I_2, \ldots \text{ are disjoint.} \tag{4.89}$$

The outer measure $\mu^o(\mathcal{A})$ of an arbitrary set $\mathcal{A}$ was defined in (4.13) as

$$\mu^o(\mathcal{A}) = \inf_{\mathcal{B} \in \mathcal{M}_c,\ \mathcal{A} \subseteq \mathcal{B}} \mu(\mathcal{B}). \tag{4.90}$$

Note that $[-T/2, T/2]$ is a cover of $\mathcal{A}$ for all $\mathcal{A}$ (recall that only sets in $[-T/2, T/2]$ are being
considered). Thus $\mu^o(\mathcal{A})$ must lie between 0 and $T$ for all $\mathcal{A}$. Also, for any two sets $\mathcal{A} \subseteq \mathcal{A}'$,
any cover of $\mathcal{A}'$ also covers $\mathcal{A}$. This implies the subset inequality for outer measure,

$$\mu^o(\mathcal{A}) \le \mu^o(\mathcal{A}') \quad \text{for } \mathcal{A} \subseteq \mathcal{A}'. \tag{4.91}$$


The following lemma develops the union bound for outer measure. Its proof illustrates several 
techniques that will be used frequently. 


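Lemma 4A.1. Let $\mathcal{S} = \bigcup_k \mathcal{A}_k$ be a countable union of arbitrary sets in $[-T/2, T/2]$. Then

$$\mu^o(\mathcal{S}) \le \sum_k \mu^o(\mathcal{A}_k). \tag{4.92}$$

Proof: The approach is to first establish an arbitrarily tight cover to each $\mathcal{A}_k$ and then show
that the union of these covers is a cover for $\mathcal{S}$. Specifically, let $\varepsilon$ be an arbitrarily small positive
number. For each $k \ge 1$, the infimum in (4.90) implies that covers exist with measures arbitrarily
little greater than that infimum. Thus a cover $\mathcal{B}_k$ to $\mathcal{A}_k$ exists with

$$\mu(\mathcal{B}_k) \le \varepsilon 2^{-k} + \mu^o(\mathcal{A}_k).$$

For each $k$, let $\mathcal{B}_k = \bigcup_j I'_{j,k}$ where $I'_{1,k}, I'_{2,k}, \ldots$ represents $\mathcal{B}_k$ by separated intervals. Then
$\mathcal{B} = \bigcup_k \mathcal{B}_k = \bigcup_k \bigcup_j I'_{j,k}$ is a countable union of intervals, so from (4.89) and Exercise 4.4,

$$\mu(\mathcal{B}) \le \sum_k \sum_j \mu(I'_{j,k}) = \sum_k \mu(\mathcal{B}_k).$$

Since $\mathcal{B}_k$ covers $\mathcal{A}_k$ for each $k$, it follows that $\mathcal{B}$ covers $\mathcal{S}$. Since $\mu^o(\mathcal{S})$ is the infimum of its
covers,

$$\mu^o(\mathcal{S}) \le \mu(\mathcal{B}) \le \sum_k \mu(\mathcal{B}_k) \le \sum_k \bigl(\varepsilon 2^{-k} + \mu^o(\mathcal{A}_k)\bigr) = \varepsilon + \sum_k \mu^o(\mathcal{A}_k).$$

Since $\varepsilon > 0$ is arbitrary, (4.92) follows.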
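
The device of giving the k-th set a cover whose excess is at most $\varepsilon 2^{-k}$ can be illustrated on single points. The following is a hedged sketch, not taken from the text: the helper names are hypothetical, each set is taken to be one rational number (outer measure 0), and its cover is an interval of width $\varepsilon 2^{-k}$ centred on it. Clipping the intervals to the interval of interest would only shorten them, so that step is omitted.

```python
from fractions import Fraction


def first_rationals(count):
    """List `count` distinct rationals in [0, 1], ordered by denominator."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen and len(out) < count:
                seen.add(r)
                out.append(r)
        q += 1
    return out


def tight_cover(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by an interval of width eps * 2**-k."""
    cover = []
    for k, r in enumerate(points, start=1):
        half = eps * Fraction(1, 2 ** (k + 1))
        cover.append((r - half, r + half))
    return cover


eps = Fraction(1, 100)
points = first_rationals(50)
cover = tight_cover(points, eps)
total = sum(b - a for a, b in cover)   # union bound on the measure of the cover
print(total < eps)
```

The total width is $\varepsilon(1 - 2^{-50})$ for 50 points and stays below $\varepsilon$ however many points are listed, which is the step that makes the sum of excesses in the proof equal to $\varepsilon$.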


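An important special case is the union of any set $\mathcal{A}$ and its complement $\overline{\mathcal{A}}$. Since $[-T/2, T/2] =
\mathcal{A} \cup \overline{\mathcal{A}}$,

$$T \le \mu^o(\mathcal{A}) + \mu^o(\overline{\mathcal{A}}). \tag{4.93}$$

The next subsection will define measurability and measure for arbitrary sets. Before that, the
following theorem shows both that countable unions of intervals are measurable and that their
measure, as defined in (4.88), is consistent with the general definition to be given later.

Theorem 4A.1. Let $\mathcal{B} = \bigcup_j I_j$ where $\{I_1, I_2, \ldots\}$ is a countable collection of intervals in
$[-T/2, T/2]$ (i.e., $\mathcal{B} \in \mathcal{M}_c$). Then

$$\mu^o(\mathcal{B}) + \mu^o(\overline{\mathcal{B}}) = T \quad \text{and} \tag{4.94}$$

$$\mu^o(\mathcal{B}) = \mu(\mathcal{B}). \tag{4.95}$$

Proof: Let $\{I'_j;\ j \ge 1\}$ be the collection of separated intervals representing $\mathcal{B}$ and let
$\mathcal{E}^k = \bigcup_{j=1}^{k} I'_j$; then $\mu(\mathcal{E}^k)$ is nondecreasing in $k$ and $\lim_{k\to\infty} \mu(\mathcal{E}^k) = \mu(\mathcal{B})$.
For any $\varepsilon > 0$, choose $k$ large enough that

$$\mu(\mathcal{E}^k) \ge \mu(\mathcal{B}) - \varepsilon. \tag{4.96}$$

The idea of the proof is to approximate $\mathcal{B}$ by $\mathcal{E}^k$, which, being in $\mathcal{M}_f$, satisfies $T = \mu(\mathcal{E}^k) + \mu(\overline{\mathcal{E}^k})$.
Thus,

$$\mu(\mathcal{B}) \le \mu(\mathcal{E}^k) + \varepsilon = T - \mu(\overline{\mathcal{E}^k}) + \varepsilon \le T - \mu^o(\overline{\mathcal{B}}) + \varepsilon, \tag{4.97}$$

where the final inequality follows because $\mathcal{E}^k \subseteq \mathcal{B}$ and thus $\overline{\mathcal{B}} \subseteq \overline{\mathcal{E}^k}$ and $\mu^o(\overline{\mathcal{B}}) \le \mu(\overline{\mathcal{E}^k})$.

Next, since $\mathcal{B} \in \mathcal{M}_c$ and $\mathcal{B} \subseteq \mathcal{B}$, $\mathcal{B}$ is a cover of itself and is a choice in the infimum defining
$\mu^o(\mathcal{B})$; thus $\mu^o(\mathcal{B}) \le \mu(\mathcal{B})$. Combining this with (4.97), $\mu^o(\mathcal{B}) + \mu^o(\overline{\mathcal{B}}) \le T + \varepsilon$. Since $\varepsilon > 0$ is
arbitrary, this implies

$$\mu^o(\mathcal{B}) + \mu^o(\overline{\mathcal{B}}) \le T. \tag{4.98}$$

This combined with (4.93) establishes (4.94). Finally, substituting $T \le \mu^o(\mathcal{B}) + \mu^o(\overline{\mathcal{B}})$ into
(4.97), $\mu(\mathcal{B}) \le \mu^o(\mathcal{B}) + \varepsilon$. Since $\mu^o(\mathcal{B}) \le \mu(\mathcal{B})$ and $\varepsilon > 0$ is arbitrary, this establishes (4.95).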


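Finally, before proceeding to arbitrary measurable sets, the joint union and intersection property,
(4.87), is extended to $\mathcal{M}_c$.


Lemma 4A.2. Let $\mathcal{B}_1$ and $\mathcal{B}_2$ be arbitrary sets in $\mathcal{M}_c$. Then

$$\mu(\mathcal{B}_1 \cup \mathcal{B}_2) + \mu(\mathcal{B}_1 \cap \mathcal{B}_2) = \mu(\mathcal{B}_1) + \mu(\mathcal{B}_2). \tag{4.99}$$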


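Proof: Let $\mathcal{B}_1$ and $\mathcal{B}_2$ be represented respectively by separated intervals, $\mathcal{B}_1 = \bigcup_j I_{1,j}$ and
$\mathcal{B}_2 = \bigcup_j I_{2,j}$. For $\ell = 1, 2$, let $\mathcal{E}_\ell^k = \bigcup_{j=1}^{k} I_{\ell,j}$ and $\mathcal{D}_\ell^k = \bigcup_{j=k+1}^{\infty} I_{\ell,j}$. Thus $\mathcal{B}_\ell = \mathcal{E}_\ell^k \cup \mathcal{D}_\ell^k$ for each
integer $k \ge 1$ and $\ell = 1, 2$. The proof is based on using $\mathcal{E}_\ell^k$, which is in $\mathcal{M}_f$ and satisfies the joint
union and intersection property, as an approximation to $\mathcal{B}_\ell$. To see how this goes, note that

$$\mathcal{B}_1 \cap \mathcal{B}_2 = (\mathcal{E}_1^k \cup \mathcal{D}_1^k) \cap (\mathcal{E}_2^k \cup \mathcal{D}_2^k) = (\mathcal{E}_1^k \cap \mathcal{E}_2^k) \cup (\mathcal{E}_1^k \cap \mathcal{D}_2^k) \cup (\mathcal{D}_1^k \cap \mathcal{B}_2).$$

For any $\varepsilon > 0$ we can choose $k$ large enough that $\mu(\mathcal{E}_\ell^k) \ge \mu(\mathcal{B}_\ell) - \varepsilon$ and $\mu(\mathcal{D}_\ell^k) \le \varepsilon$ for $\ell = 1, 2$.
Using the subset inequality and the union bound, we then have

$$\begin{aligned}
\mu(\mathcal{B}_1 \cap \mathcal{B}_2) &\le \mu(\mathcal{E}_1^k \cap \mathcal{E}_2^k) + \mu(\mathcal{D}_2^k) + \mu(\mathcal{D}_1^k) \\
&\le \mu(\mathcal{E}_1^k \cap \mathcal{E}_2^k) + 2\varepsilon.
\end{aligned}$$

By a similar but simpler argument,

$$\begin{aligned}
\mu(\mathcal{B}_1 \cup \mathcal{B}_2) &\le \mu(\mathcal{E}_1^k \cup \mathcal{E}_2^k) + \mu(\mathcal{D}_1^k) + \mu(\mathcal{D}_2^k) \\
&\le \mu(\mathcal{E}_1^k \cup \mathcal{E}_2^k) + 2\varepsilon.
\end{aligned}$$

Combining these inequalities and using (4.87) on $\mathcal{E}_1^k \in \mathcal{M}_f$ and $\mathcal{E}_2^k \in \mathcal{M}_f$, we have

$$\begin{aligned}
\mu(\mathcal{B}_1 \cap \mathcal{B}_2) + \mu(\mathcal{B}_1 \cup \mathcal{B}_2) &\le \mu(\mathcal{E}_1^k \cap \mathcal{E}_2^k) + \mu(\mathcal{E}_1^k \cup \mathcal{E}_2^k) + 4\varepsilon \\
&= \mu(\mathcal{E}_1^k) + \mu(\mathcal{E}_2^k) + 4\varepsilon \\
&\le \mu(\mathcal{B}_1) + \mu(\mathcal{B}_2) + 4\varepsilon,
\end{aligned}$$

where we have used the subset inequality in the final inequality.

For a bound in the opposite direction, we start with the subset inequality,

$$\begin{aligned}
\mu(\mathcal{B}_1 \cup \mathcal{B}_2) + \mu(\mathcal{B}_1 \cap \mathcal{B}_2) &\ge \mu(\mathcal{E}_1^k \cup \mathcal{E}_2^k) + \mu(\mathcal{E}_1^k \cap \mathcal{E}_2^k) \\
&= \mu(\mathcal{E}_1^k) + \mu(\mathcal{E}_2^k) \\
&\ge \mu(\mathcal{B}_1) + \mu(\mathcal{B}_2) - 2\varepsilon.
\end{aligned}$$

Since $\varepsilon$ is arbitrary, these two bounds establish (4.99).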


4A.4 Arbitrary measurable sets over [-T/2, T/2]

An arbitrary set $\mathcal{A} \subseteq [-T/2, T/2]$ was defined to be measurable if

$$T = \mu^o(\mathcal{A}) + \mu^o(\overline{\mathcal{A}}). \tag{4.100}$$

The measure of a measurable set was defined to be $\mu(\mathcal{A}) = \mu^o(\mathcal{A})$. The class of measurable sets
is denoted as $\mathcal{M}$. Theorem 4A.1 shows that each set $\mathcal{B} \in \mathcal{M}_c$ is measurable, i.e., $\mathcal{B} \in \mathcal{M}$ and
thus $\mathcal{M}_f \subseteq \mathcal{M}_c \subseteq \mathcal{M}$. The measure of $\mathcal{B} \in \mathcal{M}_c$ is $\mu(\mathcal{B}) = \sum_j \mu(I_j)$ for any disjoint sequence of
intervals, $I_1, I_2, \ldots$, whose union is $\mathcal{B}$.


Although the complements of sets in $\mathcal{M}_c$ are not necessarily in $\mathcal{M}_c$ (as seen from the rational
number example), they must be in $\mathcal{M}$; in fact, from (4.100), all sets in $\mathcal{M}$ have complements
in $\mathcal{M}$, i.e., $\mathcal{M}$ is closed under complements. We next show that $\mathcal{M}$ is closed under, first, finite,
and then, countable, unions and intersections. The key to these results is to first show that the
joint union and intersection property is valid for outer measure.


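Lemma 4A.3. For any measurable sets $\mathcal{A}_1$ and $\mathcal{A}_2$,

$$\mu^o(\mathcal{A}_1 \cup \mathcal{A}_2) + \mu^o(\mathcal{A}_1 \cap \mathcal{A}_2) = \mu^o(\mathcal{A}_1) + \mu^o(\mathcal{A}_2). \tag{4.101}$$

Proof: The proof is very similar to that of Lemma 4A.2, but here we use sets in $\mathcal{M}_c$ to approximate
those in $\mathcal{M}$. For any $\varepsilon > 0$, let $\mathcal{B}_1$ and $\mathcal{B}_2$ be covers of $\mathcal{A}_1$ and $\mathcal{A}_2$ respectively such that
$\mu(\mathcal{B}_\ell) \le \mu^o(\mathcal{A}_\ell) + \varepsilon$ for $\ell = 1, 2$. Let $\mathcal{D}_\ell = \mathcal{B}_\ell \cap \overline{\mathcal{A}_\ell}$ for $\ell = 1, 2$. Note that $\mathcal{A}_\ell$ and $\mathcal{D}_\ell$ are disjoint
and $\mathcal{B}_\ell = \mathcal{A}_\ell \cup \mathcal{D}_\ell$. Thus

$$\mathcal{B}_1 \cap \mathcal{B}_2 = (\mathcal{A}_1 \cup \mathcal{D}_1) \cap (\mathcal{A}_2 \cup \mathcal{D}_2) = (\mathcal{A}_1 \cap \mathcal{A}_2) \cup (\mathcal{D}_1 \cap \mathcal{A}_2) \cup (\mathcal{B}_1 \cap \mathcal{D}_2).$$

Using the union bound and subset inequality for outer measure on this and the corresponding
expansion of $\mathcal{B}_1 \cup \mathcal{B}_2$, we get

$$\begin{aligned}
\mu(\mathcal{B}_1 \cap \mathcal{B}_2) &\le \mu^o(\mathcal{A}_1 \cap \mathcal{A}_2) + \mu^o(\mathcal{D}_1) + \mu^o(\mathcal{D}_2) \le \mu^o(\mathcal{A}_1 \cap \mathcal{A}_2) + 2\varepsilon \\
\mu(\mathcal{B}_1 \cup \mathcal{B}_2) &\le \mu^o(\mathcal{A}_1 \cup \mathcal{A}_2) + \mu^o(\mathcal{D}_1) + \mu^o(\mathcal{D}_2) \le \mu^o(\mathcal{A}_1 \cup \mathcal{A}_2) + 2\varepsilon,
\end{aligned}$$

where we have also used the fact (see Exercise 4.9) that $\mu^o(\mathcal{D}_\ell) \le \varepsilon$ for $\ell = 1, 2$. Summing these
inequalities and rearranging terms,

$$\begin{aligned}
\mu^o(\mathcal{A}_1 \cup \mathcal{A}_2) + \mu^o(\mathcal{A}_1 \cap \mathcal{A}_2) &\ge \mu(\mathcal{B}_1 \cap \mathcal{B}_2) + \mu(\mathcal{B}_1 \cup \mathcal{B}_2) - 4\varepsilon \\
&= \mu(\mathcal{B}_1) + \mu(\mathcal{B}_2) - 4\varepsilon \\
&\ge \mu^o(\mathcal{A}_1) + \mu^o(\mathcal{A}_2) - 4\varepsilon,
\end{aligned}$$

where we have used (4.99) and then used $\mathcal{A}_\ell \subseteq \mathcal{B}_\ell$ for $\ell = 1, 2$. Using the subset inequality and
(4.99) to bound in the opposite direction,

$$\mu(\mathcal{B}_1) + \mu(\mathcal{B}_2) = \mu(\mathcal{B}_1 \cup \mathcal{B}_2) + \mu(\mathcal{B}_1 \cap \mathcal{B}_2) \ge \mu^o(\mathcal{A}_1 \cup \mathcal{A}_2) + \mu^o(\mathcal{A}_1 \cap \mathcal{A}_2).$$

Rearranging and using $\mu(\mathcal{B}_\ell) \le \mu^o(\mathcal{A}_\ell) + \varepsilon$,

$$\mu^o(\mathcal{A}_1 \cup \mathcal{A}_2) + \mu^o(\mathcal{A}_1 \cap \mathcal{A}_2) \le \mu^o(\mathcal{A}_1) + \mu^o(\mathcal{A}_2) + 2\varepsilon.$$

Since $\varepsilon$ is arbitrary, these bounds establish (4.101).
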
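Theorem 4A.2. Assume $\mathcal{A}_1, \mathcal{A}_2 \in \mathcal{M}$. Then $\mathcal{A}_1 \cup \mathcal{A}_2 \in \mathcal{M}$ and $\mathcal{A}_1 \cap \mathcal{A}_2 \in \mathcal{M}$.


Proof: Apply (4.101) to $\overline{\mathcal{A}_1}$ and $\overline{\mathcal{A}_2}$, getting

$$\mu^o(\overline{\mathcal{A}_1} \cup \overline{\mathcal{A}_2}) + \mu^o(\overline{\mathcal{A}_1} \cap \overline{\mathcal{A}_2}) = \mu^o(\overline{\mathcal{A}_1}) + \mu^o(\overline{\mathcal{A}_2}).$$

Rewriting $\overline{\mathcal{A}_1} \cup \overline{\mathcal{A}_2}$ as $\overline{\mathcal{A}_1 \cap \mathcal{A}_2}$ and $\overline{\mathcal{A}_1} \cap \overline{\mathcal{A}_2}$ as $\overline{\mathcal{A}_1 \cup \mathcal{A}_2}$ and adding this to (4.101),

$$\begin{aligned}
\bigl[\mu^o(\mathcal{A}_1 \cup \mathcal{A}_2) + \mu^o(\overline{\mathcal{A}_1 \cup \mathcal{A}_2})\bigr] &+ \bigl[\mu^o(\mathcal{A}_1 \cap \mathcal{A}_2) + \mu^o(\overline{\mathcal{A}_1 \cap \mathcal{A}_2})\bigr] \\
&= \mu^o(\mathcal{A}_1) + \mu^o(\mathcal{A}_2) + \mu^o(\overline{\mathcal{A}_1}) + \mu^o(\overline{\mathcal{A}_2}) = 2T,
\end{aligned} \tag{4.102}$$

where we have used (4.100). Each of the bracketed terms above is at least $T$ from (4.93), so
each term must be exactly $T$. Thus $\mathcal{A}_1 \cup \mathcal{A}_2$ and $\mathcal{A}_1 \cap \mathcal{A}_2$ are measurable.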


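Since $\mathcal{A}_1 \cup \mathcal{A}_2$ and $\mathcal{A}_1 \cap \mathcal{A}_2$ are measurable if $\mathcal{A}_1$ and $\mathcal{A}_2$ are, the joint union and intersection
property holds for measure as well as outer measure for all measurable sets, i.e.,

$$\mu(\mathcal{A}_1 \cup \mathcal{A}_2) + \mu(\mathcal{A}_1 \cap \mathcal{A}_2) = \mu(\mathcal{A}_1) + \mu(\mathcal{A}_2). \tag{4.103}$$

If $\mathcal{A}_1$ and $\mathcal{A}_2$ are disjoint, then (4.103) simplifies to the additivity property

$$\mu(\mathcal{A}_1 \cup \mathcal{A}_2) = \mu(\mathcal{A}_1) + \mu(\mathcal{A}_2). \tag{4.104}$$

Actually, (4.103) shows that (4.104) holds whenever $\mu(\mathcal{A}_1 \cap \mathcal{A}_2) = 0$. That is, $\mathcal{A}_1$ and $\mathcal{A}_2$ need
not be disjoint, but need only have an intersection of zero measure. This is another example in
which sets of zero measure can be ignored.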

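The following theorem shows that $\mathcal{M}$ is closed under disjoint countable unions and that $\mathcal{M}$ is
countably additive.

Theorem 4A.3. Assume that $\mathcal{A}_j \in \mathcal{M}$ for each integer $j \ge 1$ and that $\mu(\mathcal{A}_j \cap \mathcal{A}_\ell) = 0$ for all
$j \ne \ell$. Let $\mathcal{A} = \bigcup_j \mathcal{A}_j$. Then $\mathcal{A} \in \mathcal{M}$ and

$$\mu(\mathcal{A}) = \sum_j \mu(\mathcal{A}_j). \tag{4.105}$$

Proof: Let $\mathcal{A}^k = \bigcup_{j=1}^{k} \mathcal{A}_j$ for each integer $k \ge 1$. Then $\mathcal{A}^{k+1} = \mathcal{A}^k \cup \mathcal{A}_{k+1}$ and, by induction
on the previous theorem, $\mathcal{A}^k \in \mathcal{M}$. It also follows that

$$\mu(\mathcal{A}^k) = \sum_{j=1}^{k} \mu(\mathcal{A}_j).$$

The sum on the right is nondecreasing in $k$ and bounded by $T$, so the limit as $k \to \infty$ exists.
Applying the union bound for outer measure to $\mathcal{A}$,

$$\mu^o(\mathcal{A}) \le \sum_j \mu^o(\mathcal{A}_j) = \lim_{k\to\infty} \mu^o(\mathcal{A}^k) = \lim_{k\to\infty} \mu(\mathcal{A}^k). \tag{4.106}$$

Since $\mathcal{A}^k \subseteq \mathcal{A}$, we see that $\overline{\mathcal{A}} \subseteq \overline{\mathcal{A}^k}$ and $\mu^o(\overline{\mathcal{A}}) \le \mu(\overline{\mathcal{A}^k}) = T - \mu(\mathcal{A}^k)$. Thus

$$\mu^o(\overline{\mathcal{A}}) \le T - \lim_{k\to\infty} \mu(\mathcal{A}^k). \tag{4.107}$$

Adding (4.106) and (4.107) shows that $\mu^o(\mathcal{A}) + \mu^o(\overline{\mathcal{A}}) \le T$. Combining with (4.93), $\mu^o(\mathcal{A}) +
\mu^o(\overline{\mathcal{A}}) = T$ and (4.106) and (4.107) are satisfied with equality. Thus $\mathcal{A} \in \mathcal{M}$ and countable
additivity, (4.105), is satisfied.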


Next it is shown that M is closed under arbitrary countable unions and intersections. 


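Theorem 4A.4. Assume that $\mathcal{A}_j \in \mathcal{M}$ for each integer $j \ge 1$. Then $\mathcal{A} = \bigcup_j \mathcal{A}_j$ and $\mathcal{D} = \bigcap_j \mathcal{A}_j$
are both in $\mathcal{M}$.


Proof: Let $\mathcal{A}'_1 = \mathcal{A}_1$ and, for each $k \ge 1$, let $\mathcal{A}^k = \bigcup_{j=1}^{k} \mathcal{A}_j$ and let $\mathcal{A}'_{k+1} = \mathcal{A}_{k+1} \cap \overline{\mathcal{A}^k}$.
By induction, the sets $\mathcal{A}'_1, \mathcal{A}'_2, \ldots$ are disjoint and measurable and $\mathcal{A} = \bigcup_j \mathcal{A}'_j$. Thus, from
Theorem 4A.3, $\mathcal{A}$ is measurable. Next suppose $\mathcal{D} = \bigcap_j \mathcal{A}_j$. Then $\overline{\mathcal{D}} = \bigcup_j \overline{\mathcal{A}_j}$. Thus, $\overline{\mathcal{D}} \in \mathcal{M}$, so
$\mathcal{D} \in \mathcal{M}$ also.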


Proof of Theorem 4.3.1: The first two parts of Theorem 4.3.1 are Theorems 4A.4 and
4A.3. The third part, that $\mathcal{A}$ is measurable with zero measure if $\mu^o(\mathcal{A}) = 0$, follows from
$T \le \mu^o(\mathcal{A}) + \mu^o(\overline{\mathcal{A}}) = \mu^o(\overline{\mathcal{A}})$ and $\mu^o(\overline{\mathcal{A}}) \le T$, i.e., that $\mu^o(\overline{\mathcal{A}}) = T$.

Sets of zero measure are quite important in understanding Lebesgue integration, so it is important
to know whether there are also uncountable sets of points that have zero measure. The
answer is yes; a simple example follows.


Example 4A.6 (The Cantor set). Express each point in the interval (0,1) by a ternary expansion.
Let $\mathcal{B}$ be the set of points in (0,1) for which that expansion contains only 0's and 2's
and is also nonterminating. Thus $\mathcal{B}$ excludes the interval [1/3, 2/3), since all these expansions
start with 1. Similarly, $\mathcal{B}$ excludes [1/9, 2/9) and [7/9, 8/9), since the second digit is 1 in these
expansions. The right end point for each of these intervals is also excluded since it has a terminating
expansion. Let $\mathcal{B}_n$ be the set of points with no 1 in the first $n$ digits of the ternary
expansion. Then $\mu(\mathcal{B}_n) = (2/3)^n$. Since $\mathcal{B}$ is contained in $\mathcal{B}_n$ for each $n \ge 1$, the subset
inequality gives $\mu^o(\mathcal{B}) \le (2/3)^n$ for every $n$, so $\mu^o(\mathcal{B}) = 0$; by Theorem 4.3.1, $\mathcal{B}$ is measurable
and $\mu(\mathcal{B}) = 0$.

The expansion for each point in B is a binary sequence (viewing 0 and 2 as the binary digits 
here). There are uncountably many binary sequences (see Section 4A.1), and this remains true 
when the countable number of terminating sequences are removed. Thus we have demonstrated 
an uncountably infinite set of numbers with zero measure. 
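
The claim that the n-th stage of the construction has measure $(2/3)^n$ can be checked exactly with rational arithmetic. This is a small sketch rather than part of the original example; `cantor_stage` is a hypothetical helper, and end points are ignored because single points do not affect the measure.

```python
from fractions import Fraction


def cantor_stage(n):
    """Intervals (a, b) that remain after removing middle thirds n times from [0, 1].

    These are the points whose first n ternary digits contain no 1
    (up to end points, which have zero measure).
    """
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        kept = []
        for a, b in intervals:
            third = (b - a) / 3
            kept.append((a, a + third))      # leading digit 0 at this level
            kept.append((b - third, b))      # leading digit 2 at this level
        intervals = kept
    return intervals


for n in range(6):
    stage = cantor_stage(n)
    width = sum(b - a for a, b in stage)
    print(n, len(stage), width)
```

Each stage doubles the number of intervals and multiplies the total width by 2/3, so the widths form the sequence $(2/3)^n$, which tends to 0.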


Not all point sets are Lebesgue measurable, and an example follows. 


Example 4A.7 (A non-measurable set). Consider the interval [0, 1). We define a collection
of equivalence classes where two points in [0, 1) are in the same equivalence class if the difference
between them is rational. Thus one equivalence class consists of the rationals in [0, 1). Each other
equivalence class consists of a countably infinite set of irrationals whose differences are rational.
This partitions [0, 1) into an uncountably infinite set of equivalence classes. Now consider a set
$\mathcal{A}$ that contains exactly one number chosen from each equivalence class. We will assume that $\mathcal{A}$
is measurable and show that this leads to a contradiction.


For the given set $\mathcal{A}$, let $\mathcal{A} + r$, for $r$ rational in [0, 1), denote the set that results from mapping
each $t \in \mathcal{A}$ into either $t + r$ or $t + r - 1$, whichever lies in [0, 1). The set $\mathcal{A} + r$ is thus the set $\mathcal{A}$,
shifted by $r$, and then rotated to lie in [0, 1). By looking at outer measures, it is easy to see that
$\mathcal{A} + r$ is measurable if $\mathcal{A}$ is and that both then have the same measure. Finally, each $t \in [0, 1)$
lies in exactly one equivalence class, and if $\tau$ is the element of $\mathcal{A}$ in that equivalence class, then $t$
lies in $\mathcal{A} + r$ where $r = t - \tau$ or $t - \tau + 1$. In other words, $[0, 1) = \bigcup_r (\mathcal{A} + r)$ and the sets $\mathcal{A} + r$ are
disjoint. Assuming that $\mathcal{A}$ is measurable, Theorem 4A.3 asserts that $1 = \sum_r \mu(\mathcal{A} + r)$. The sum
on the right, however, is 0 if $\mu(\mathcal{A}) = 0$ and infinite if $\mu(\mathcal{A}) > 0$, establishing the contradiction.

4.E Exercises


4.1. (Fourier series) (a) Consider the function u(t) = rect(2t) of Figure 4.2. Give a general
expression for the Fourier series coefficients for the Fourier series over [-1/2, 1/2] and
show that the series converges to 1/2 at each of the end points, -1/4 and 1/4. Hint: You
don't need to know anything about convergence here.

(b) Represent the same function as a Fourier series over the interval [—1/4,1/4]. What 
does this series converge to at -1/4 and 1/4? Note from this exercise that the Fourier series 
depends on the interval over which it is taken. 


4.2. (Energy equation) Derive (4.6), the energy equation for Fourier series. Hint: Substitute the
Fourier series for $u(t)$ into $\int u(t) u^*(t)\, dt$. Don't worry about convergence or interchange of
limits here.


4.3. (Countability) As shown in Appendix 4A.1, many subsets of the real numbers, including 
the integers and the rationals, are countable. Sometimes, however, it is necessary to give 
up the ordinary numerical ordering in listing the elements of these subsets. This exercise 
shows that this is sometimes inevitable. 

(a) Show that every listing of the integers (such as 0, -1, 1, -2, ...) fails to preserve the
numerical ordering of the integers. Hint: assume such a numerically ordered listing exists
and show that it can have no first element (i.e., no smallest element).

(b) Show that the rational numbers in the interval (0, 1) cannot be listed in a way that 
preserves their numerical ordering. 

(c) Show that the rationals in [0,1] cannot be listed with a preservation of numerical ordering 
(the first element is no problem, but what about the second?). 


4.4. (Countable sums) Let $a_1, a_2, \ldots$ be a countable set of non-negative numbers and assume
that $s_a(k) = \sum_{j=1}^{k} a_j \le A$ for all $k$ and some given $A > 0$.
(a) Show that the limit $\lim_{k\to\infty} s_a(k)$ exists with some value $S_a$ between 0 and $A$. (Use
any level of mathematical care that you feel comfortable with.)
(b) Now let $b_1, b_2, \ldots$ be another ordering of the numbers $a_1, a_2, \ldots$. That is, let $b_1 =
a_{j(1)}, b_2 = a_{j(2)}, \ldots, b_\ell = a_{j(\ell)}, \ldots$, where $j(\ell)$ is a permutation of the positive integers, i.e.,
a one-to-one function from $\mathbb{Z}^+$ to $\mathbb{Z}^+$. Let $s_b(k) = \sum_{\ell=1}^{k} b_\ell$. Show that $\lim_{k\to\infty} s_b(k) \le S_a$.
Hint: Note that

$$\sum_{\ell=1}^{k} b_\ell = \sum_{\ell=1}^{k} a_{j(\ell)}.$$

(c) Define $S_b = \lim_{k\to\infty} s_b(k)$ and show that $S_b \ge S_a$. Hint: Consider the inverse permutation,
say $\ell^{-1}(j)$, which for given $j'$ is that $\ell$ for which $j(\ell) = j'$. Note that you have shown
that a countable sum of non-negative elements does not depend on the order of summation.


(d) Show that the above result is not necessarily true for a countable sum of numbers that 
can be positive or negative. Hint: consider alternating series. 


4.5. (Finite unions of intervals) Let $\mathcal{E} = \bigcup_{j=1}^{\ell} I_j$ be the union of $\ell \ge 2$ arbitrary nonempty
intervals. Let $a_j$ and $b_j$ denote the left and right end points respectively of $I_j$; each end
point can be included or not. Assume the intervals are ordered so that $a_1 \le a_2 \le \cdots \le a_\ell$.
(a) For $\ell = 2$, show that either $I_1$ and $I_2$ are separated or that $\mathcal{E}$ is a single interval whose
left end point is $a_1$.
(b) For $\ell > 2$ and $2 \le k < \ell$, let $\mathcal{E}^k = \bigcup_{j=1}^{k} I_j$. Give an algorithm for constructing a union
of separated intervals for $\mathcal{E}^{k+1}$ given a union of separated intervals for $\mathcal{E}^k$.

(c) Note that using part (b) inductively yields a representation of $\mathcal{E}$ as a union of separated
intervals. Show that the left end point for each separated interval is drawn from $a_1, \ldots, a_\ell$
and the right end point is drawn from $b_1, \ldots, b_\ell$.

(d) Show that this representation is unique, i.e., that $\mathcal{E}$ cannot be represented as the
union of any other set of separated intervals. Note that this means that $\mu(\mathcal{E})$ is defined
unambiguously in (4.9).

4.6. (Countable unions of intervals) Let $\mathcal{B} = \bigcup_j I_j$ be a countable union of arbitrary (perhaps
intersecting) intervals. For each $k \ge 1$, let $\mathcal{B}^k = \bigcup_{j=1}^{k} I_j$ and for each $k \ge j$, let $I_{j,k}$ be the
separated interval in $\mathcal{B}^k$ containing $I_j$ (see Exercise 4.5).

(a) For each $k \ge j \ge 1$, show that $I_{j,k} \subseteq I_{j,k+1}$.

(b) Let $\bigcup_{k=j}^{\infty} I_{j,k} = I'_j$. Explain why $I'_j$ is an interval and show that $I'_j \subseteq \mathcal{B}$.

(c) For any $i, j$, show that either $I'_j = I'_i$ or $I'_j$ and $I'_i$ are separated intervals.

(d) Show that the sequence $\{I'_j;\ 1 \le j < \infty\}$ with repetitions removed is a countable
separated-interval representation of $\mathcal{B}$.

(e) Show that the collection $\{I'_j;\ j \ge 1\}$ with repetitions removed is unique; i.e., show that
if an arbitrary interval $I$ is contained in $\mathcal{B}$, then it is contained in one of the $I'_j$. Note
however that the ordering of the $I'_j$ is not unique.

4.7. (Union bound for intervals) Prove the validity of the union bound for a countable collection
of intervals in (4.89). The following steps are suggested:
(a) Show that if $\mathcal{B} = I_1 \cup I_2$ for arbitrary intervals $I_1, I_2$, then $\mu(\mathcal{B}) \le \mu(I_1) + \mu(I_2)$ with
equality if $I_1$ and $I_2$ are disjoint. Note: this is true by definition if $I_1$ and $I_2$ are separated,
so you need only treat the cases where $I_1$ and $I_2$ intersect or are disjoint but not separated.
(b) Let $\mathcal{B}^k = \bigcup_{j=1}^{k} I_j$ be represented as the union of say $m_k$ separated intervals ($m_k \le k$),
so $\mathcal{B}^k = \bigcup_{j=1}^{m_k} I'_j$. Show that $\mu(\mathcal{B}^k \cup I_{k+1}) \le \mu(\mathcal{B}^k) + \mu(I_{k+1})$ with equality if $\mathcal{B}^k$ and $I_{k+1}$
are disjoint.
(c) Use finite induction to show that if $\mathcal{B} = \bigcup_{j=1}^{k} I_j$ is a finite union of arbitrary intervals,
then $\mu(\mathcal{B}) \le \sum_{j=1}^{k} \mu(I_j)$ with equality if the intervals are disjoint.
(d) Extend part (c) to a countably infinite union of intervals.

4.8. For each positive integer $n$, let $\mathcal{B}_n$ be a countable union of intervals. Show that $\mathcal{B} = \bigcup_n \mathcal{B}_n$
is also a countable union of intervals. Hint: Look at Example 4A.2 in Section 4A.1.


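4.9. (Measure and covers) Let $\mathcal{A}$ be an arbitrary measurable set in $[-T/2, T/2]$ and let $\mathcal{B}$ be
a cover of $\mathcal{A}$. Using only results derived prior to Lemma 4A.3, show that $\mu^o(\mathcal{B} \cap \overline{\mathcal{A}}) =
\mu(\mathcal{B}) - \mu(\mathcal{A})$. You may use the following steps if you wish.

(a) Show that $\mu^o(\mathcal{B} \cap \overline{\mathcal{A}}) \ge \mu(\mathcal{B}) - \mu(\mathcal{A})$.
(b) For any $\delta > 0$, let $\mathcal{B}'$ be a cover of $\overline{\mathcal{A}}$ with $\mu(\mathcal{B}') \le \mu(\overline{\mathcal{A}}) + \delta$. Use Lemma 4A.2 to show
that $\mu(\mathcal{B} \cap \mathcal{B}') = \mu(\mathcal{B}) + \mu(\mathcal{B}') - T$.
(c) Show that $\mu^o(\mathcal{B} \cap \overline{\mathcal{A}}) \le \mu(\mathcal{B} \cap \mathcal{B}') \le \mu(\mathcal{B}) - \mu(\mathcal{A}) + \delta$.
(d) Show that $\mu^o(\mathcal{B} \cap \overline{\mathcal{A}}) = \mu(\mathcal{B}) - \mu(\mathcal{A})$.
4.10. (Intersection of covers) Let $\mathcal{A}$ be an arbitrary set in $[-T/2, T/2]$.
(a) Show that $\mathcal{A}$ has a sequence of covers, $\mathcal{B}_1, \mathcal{B}_2, \ldots$ such that $\mu^o(\mathcal{A}) = \mu(\mathcal{D})$ where
$\mathcal{D} = \bigcap_n \mathcal{B}_n$.
(b) Show that $\mathcal{A} \subseteq \mathcal{D}$.

(c) Show that if $\mathcal{A}$ is measurable, then $\mu(\mathcal{D} \cap \overline{\mathcal{A}}) = 0$. Note that you have shown that an
arbitrary measurable set can be represented as a countable intersection of countable unions
of intervals, less a set of zero measure. Argue by example that if $\mathcal{A}$ is not measurable, then
$\mu^o(\mathcal{D} \cap \overline{\mathcal{A}})$ need not be 0.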

4.11. (Measurable functions) (a) For $\{u(t) : [-T/2, T/2] \to \mathbb{R}\}$, show that if $\{t : u(t) < \beta\}$ is
measurable, then $\{t : u(t) \ge \beta\}$ is measurable.

(b) Show that if $\{t : u(t) < \beta\}$ and $\{t : u(t) < \alpha\}$ are measurable, $\alpha < \beta$, then $\{t : \alpha \le
u(t) < \beta\}$ is measurable.

(c) Show that if $\{t : u(t) < \beta\}$ is measurable for all $\beta$, then $\{t : u(t) \le \beta\}$ is also measurable.
Hint: Express $\{t : u(t) \le \beta\}$ as a countable intersection of measurable sets.

(d) Show that if $\{t : u(t) \le \beta\}$ is measurable for all $\beta$, then $\{t : u(t) < \beta\}$ is also measurable,
i.e., the definition of measurable function can use either strict or nonstrict inequality.

4.12. (Measurable functions) Assume throughout that $\{u(t) : [-T/2, T/2] \to \mathbb{R}\}$ is measurable.
(a) Show that $-u(t)$ and $|u(t)|$ are measurable.

(b) Assume that $\{g(x) : \mathbb{R} \to \mathbb{R}\}$ is an increasing function (i.e., $x_1 < x_2 \Longrightarrow g(x_1) <
g(x_2)$). Prove that $v(t) = g(u(t))$ is measurable. Hint: This is a one-liner. If the abstraction
confuses you, first show that $\exp(u(t))$ is measurable and then prove the more general
result.

(c) Show that $\exp[u(t)]$, $u^2(t)$, and $\ln |u(t)|$ are all measurable.

4.13. (Measurable functions) (a) Show that if $\{u(t) : [-T/2, T/2] \to \mathbb{R}\}$ and $\{v(t) : [-T/2, T/2] \to
\mathbb{R}\}$ are measurable, then $u(t) + v(t)$ is also measurable. Hint: Use a discrete approximation
to the sum and then go to the limit.

(b) Show that $u(t)v(t)$ is also measurable.

4.14. (Measurable sets) Suppose $\mathcal{A}$ is a subset of $[-T/2, T/2]$ and is measurable over $[-T/2, T/2]$.
Show that $\mathcal{A}$ is also measurable, with the same measure, over $[-T'/2, T'/2]$ for any $T'$
satisfying $T' > T$. Hint: Let $\mu'(\mathcal{A})$ be the outer measure of $\mathcal{A}$ over $[-T'/2, T'/2]$ and show
that $\mu'(\mathcal{A}) = \mu^o(\mathcal{A})$ where $\mu^o$ is the outer measure over $[-T/2, T/2]$. Then let $\overline{\mathcal{A}}'$ be the
complement of $\mathcal{A}$ over $[-T'/2, T'/2]$ and show that $\mu'(\overline{\mathcal{A}}') = \mu^o(\overline{\mathcal{A}}) + T' - T$.

4.15. (Measurable limits) (a) Assume that $\{u_n(t) : [-T/2, T/2] \to \mathbb{R}\}$ is measurable for each
$n \ge 1$. Show that $\liminf_n u_n(t)$ is measurable ($\liminf_n u_n(t)$ means $\lim_m v_m(t)$ where
$v_m(t) = \inf_{n \ge m} u_n(t)$ and infinite values are allowed).

(b) Show that $\lim_n u_n(t)$ exists for a given $t$ if and only if $\liminf_n u_n(t) = \limsup_n u_n(t)$.
(c) Show that the set of $t$ for which $\lim_n u_n(t)$ exists is measurable. Show that a function
$u(t)$ that is $\lim_n u_n(t)$ when the limit exists and is 0 otherwise is measurable.

4.16. (Lebesgue integration) For each integer $n \ge 1$, define $u_n(t) = 2^n\, \mathrm{rect}(2^n t - 1)$. Sketch
the first few of these waveforms. Show that $\lim_{n\to\infty} u_n(t) = 0$ for all $t$. Show that
$\int \lim_n u_n(t)\, dt \ne \lim_n \int u_n(t)\, dt$.

4.17. ($\mathcal{L}_1$ integrals) (a) Assume that $\{u(t) : [-T/2, T/2] \to \mathbb{R}\}$ is $\mathcal{L}_1$. Show that

$$\left| \int u(t)\, dt \right| = \left| \int u^+(t)\, dt - \int u^-(t)\, dt \right| \le \int |u(t)|\, dt.$$

(b) Assume that $\{u(t) : [-T/2, T/2] \to \mathbb{C}\}$ is $\mathcal{L}_1$. Show that

$$\left| \int u(t)\, dt \right| \le \int |u(t)|\, dt.$$

Hint: Choose $\alpha$ such that $\alpha \int u(t)\, dt$ is real and nonnegative and $|\alpha| = 1$. Use part (a) on
$\alpha u(t)$.

4.18. ($\mathcal{L}_2$ equivalence) Assume that $\{u(t) : [-T/2, T/2] \to \mathbb{C}\}$ and $\{v(t) : [-T/2, T/2] \to \mathbb{C}\}$ are
$\mathcal{L}_2$ functions.
(a) Show that if $u(t)$ and $v(t)$ are equal a.e., then they are $\mathcal{L}_2$ equivalent.
(b) Show that if $u(t)$ and $v(t)$ are $\mathcal{L}_2$ equivalent, then for any $\varepsilon > 0$, the set $\{t : |u(t) -
v(t)|^2 \ge \varepsilon\}$ has zero measure.
(c) Using (b), show that $\mu\{t : |u(t) - v(t)| > 0\} = 0$, i.e., that $u(t) = v(t)$ a.e.

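4.19. (Orthogonal expansions) Assume that $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ is $\mathcal{L}_2$. Let $\{\theta_k(t);\ 1 \le k < \infty\}$ be a
set of orthogonal waveforms and assume that $u(t)$ has the orthogonal expansion

$$u(t) = \sum_{k=1}^{\infty} u_k \theta_k(t).$$

Assume the set of orthogonal waveforms satisfy

$$\int_{-\infty}^{\infty} \theta_k(t) \theta_j^*(t)\, dt = \begin{cases} 0 & \text{for } k \ne j \\ A_j & \text{for } k = j, \end{cases}$$

where $\{A_j\}$ is an arbitrary set of positive numbers. Do not concern yourself with convergence
issues in this exercise.
(a) Show that each $u_k$ can be expressed in terms of $\int_{-\infty}^{\infty} u(t) \theta_k^*(t)\, dt$ and $A_k$.
(b) Find the energy $\int_{-\infty}^{\infty} |u(t)|^2\, dt$ in terms of $\{u_k\}$ and $\{A_k\}$.
(c) Suppose that $v(t) = \sum_k v_k \theta_k(t)$ where $v(t)$ also has finite energy. Express
$\int_{-\infty}^{\infty} u(t) v^*(t)\, dt$ as a function of $\{u_k, v_k, A_k;\ k \in \mathbb{Z}\}$.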

4.20. (Fourier series) (a) Verify that (4.22) and (4.23) follow from (4.20) and (4.18) using the
transformation $u(t) = v(t + \Delta)$.

(b) Consider the Fourier series in periodic form, $w(t) = \sum_k \hat{w}_k e^{2\pi i k t/T}$ where $\hat{w}_k =
(1/T) \int_{-T/2}^{T/2} w(t) e^{-2\pi i k t/T}\, dt$. Show that for any real $\Delta$, $(1/T) \int_{-T/2+\Delta}^{T/2+\Delta} w(t) e^{-2\pi i k t/T}\, dt$ is
also equal to $\hat{w}_k$, providing an alternate derivation of (4.22) and (4.23).

4.21. Equation (4.27) claims that

$$\lim_{n \to \infty,\ \ell \to \infty} \int \Bigl| u(t) - \sum_{m=-n}^{n} \sum_{k=-\ell}^{\ell} \hat{u}_{k,m} \theta_{k,m}(t) \Bigr|^2 dt = 0.$$

(a) Show that the integral above is non-increasing in both $\ell$ and $n$.

(b) Show that the limit is independent of how $n$ and $\ell$ approach $\infty$. Hint: See Exercise 4.4.

(c) More generally, show that the limit is the same if the pair $(k, m)$, $k \in \mathbb{Z}$, $m \in \mathbb{Z}$ is
ordered in an arbitrary way and the limit above is replaced by a limit on the partial sums
according to that ordering.

4.22. (Truncated sinusoids) (a) Verify (4.24) for $\mathcal{L}_2$ waveforms, i.e., show that

$$\lim_{n \to \infty} \int \Bigl| u(t) - \sum_{m=-n}^{n} u_m(t) \Bigr|^2 dt = 0.$$

(b) Break the integral in (4.28) into separate integrals for $|t| > (n + \tfrac{1}{2})T$ and $|t| \le (n + \tfrac{1}{2})T$.
Show that the first integral goes to 0 with increasing $n$.
(c) For given $n$, show that the second integral above goes to 0 with increasing $\ell$.

4.23. (Convolution) The left side of (4.40) is a function of $t$. Express the Fourier transform of
this as a double integral over $t$ and $\tau$. For each $t$, make the substitution $r = t - \tau$ and
integrate over $r$. Then integrate over $\tau$ to get the right side of (4.40). Do not concern
yourself with convergence issues here.

4.24. (Continuity of $\mathcal{L}_1$ transform) Assume that $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ is $\mathcal{L}_1$ and let $\hat{u}(f)$ be its Fourier
transform. Let $\varepsilon$ be any given positive number.

(a) Show that for sufficiently large $T$, $\int_{|t| > T} |u(t) e^{-2\pi i f t} - u(t) e^{-2\pi i (f - \delta) t}|\, dt < \varepsilon/2$ for all $f$
and all $\delta > 0$.

(b) For the $\varepsilon$ and $T$ selected above, show that $\int_{|t| \le T} |u(t) e^{-2\pi i f t} - u(t) e^{-2\pi i (f - \delta) t}|\, dt < \varepsilon/2$
for all $f$ and sufficiently small $\delta > 0$. This shows that $\hat{u}(f)$ is continuous.

4.25. (Plancherel) The purpose of this exercise is to get some understanding of the Plancherel
theorem. Assume that $u(t)$ is $\mathcal{L}_2$ and has a Fourier transform $\hat{u}(f)$.

(a) Show that $\hat{u}(f) - \hat{u}_A(f)$ is the Fourier transform of the function $x_A(t)$ that is 0 from
$-A$ to $A$ and equal to $u(t)$ elsewhere.

(b) Argue that since $\int_{-\infty}^{\infty} |u(t)|^2\, dt$ is finite, the integral $\int_{-\infty}^{\infty} |x_A(t)|^2\, dt$ must go to 0 as $A \to
\infty$. Use whatever level of mathematical care and common sense that you feel comfortable
with.
(c) Using the energy equation (4.45), argue that

$$\lim_{A \to \infty} \int_{-\infty}^{\infty} |\hat{u}(f) - \hat{u}_A(f)|^2\, df = 0.$$

Note: This is only the easy part of the Plancherel theorem. The difficult part is to show
the existence of $\hat{u}(f)$. The limit as $A \to \infty$ of the integral $\int_{-A}^{A} u(t) e^{-2\pi i f t}\, dt$ need not exist
for all $f$, and the point of the Plancherel theorem is to forget about this limit for individual
$f$ and focus instead on the energy in the difference between the hypothesized $\hat{u}(f)$ and the
approximations.

4.26. ($\mathcal{L}_2$ functions) Assume that $\{u(t) : \mathbb{R} \to \mathbb{C}\}$ and $\{v(t) : \mathbb{R} \to \mathbb{C}\}$ are $\mathcal{L}_2$ and that $a$ and $b$
are complex numbers. Show that $au(t) + bv(t)$ is $\mathcal{L}_2$. For $T > 0$, show that $u(t - T)$ and
$u(t/T)$ are $\mathcal{L}_2$ functions.

4.27. (Relation of Fourier series to Fourier integral) Assume that $\{u(t) : [-T/2, T/2] \to \mathbb{C}\}$ is
$\mathcal{L}_2$. Without being very careful about the mathematics, the Fourier series expansion of
$\{u(t)\}$ is given by

$$u(t) = \lim_{\ell \to \infty} u^{(\ell)}(t) \quad \text{where} \quad u^{(\ell)}(t) = \sum_{k=-\ell}^{\ell} \hat{u}_k e^{2\pi i k t/T}\, \mathrm{rect}\!\left(\frac{t}{T}\right),$$

$$\hat{u}_k = \frac{1}{T} \int_{-T/2}^{T/2} u(t) e^{-2\pi i k t/T}\, dt.$$

(a) Does the above limit hold for all $t \in [-T/2, T/2]$? If not, what can you say about the
type of convergence?

(b) Does the Fourier transform $\hat{u}(f) = \int_{-T/2}^{T/2} u(t) e^{-2\pi i f t}\, dt$ exist for all $f$? Explain.

(c) The Fourier transform of the finite sum $u^{(\ell)}(t)$ is $\hat{u}^{(\ell)}(f) = \sum_{k=-\ell}^{\ell} \hat{u}_k T\, \mathrm{sinc}(fT - k)$. In
the limit $\ell \to \infty$, $\hat{u}(f) = \lim_{\ell \to \infty} \hat{u}^{(\ell)}(f)$, so

$$\hat{u}(f) = \lim_{\ell \to \infty} \sum_{k=-\ell}^{\ell} \hat{u}_k T\, \mathrm{sinc}(fT - k).$$

Give a brief explanation why this equation must hold with equality for all $f \in \mathbb{R}$. Also show
that $\{\hat{u}(f) : f \in \mathbb{R}\}$ is completely specified by its values, $\{\hat{u}(k/T) : k \in \mathbb{Z}\}$ at multiples of
$1/T$.


4.28. (sampling) One often approximates the value of an integral by a discrete sum; i.e., 


$$\int_{-\infty}^{\infty} g(t)\,dt \approx \delta \sum_{k} g(k\delta).$$

(a) Show that if $u(t)$ is a real finite-energy function, low-pass limited to $W$ Hz, then the
above approximation is exact for $g(t) = u^2(t)$ if $\delta \le 1/(2W)$; i.e., show that

$$\int_{-\infty}^{\infty} u^2(t)\,dt = \delta \sum_{k} u^2(k\delta).$$

(b) Show that if $g(t)$ is a real finite-energy function, low-pass limited to $W$ Hz, then for
$\delta \le 1/(2W)$,

$$\int_{-\infty}^{\infty} g(t)\,dt = \delta \sum_{k} g(k\delta).$$

(c) Show that if $\delta > 1/(2W)$, then there exists no such relation in general.


4.29. (degrees of freedom) This exercise explores how much of the energy of a baseband-limited
function {u(t) : [-1/2,1/2] → R} can reside outside the region where the sampling coefficients
are nonzero. Let $T = 1/(2W) = 1$ and let $n$ be a given positive even integer. Let
$u_k = (-1)^k$ for $-n \le k \le n$ and $u_k = 0$ for $|k| > n$. Show that $|u(n + \tfrac{1}{2})|$ increases without
bound as the end point $n$ is increased. Show that $|u(n + m + \tfrac{1}{2})| > |u(n - m - \tfrac{1}{2})|$ for all
integer $m$, $0 \le m < n$. In other words, shifting the sample points by 1/2 leads to most of
the sample point energy being outside the interval $[-n, n]$.


4.30. (sampling theorem for $[\Delta - W, \Delta + W]$) (a) Verify the Fourier transform pair in (4.70).
Hint: Use the scaling and shifting rules on $\mathrm{rect}(f) \leftrightarrow \mathrm{sinc}(t)$.


(b) Show that the functions making up that expansion are orthogonal. Hint: Show that 
the corresponding Fourier transforms are orthogonal. 


(c) Show that the functions in (4.74) are orthogonal.

4.31. (Amplitude limited functions) Sometimes it is important to generate baseband waveforms
with bounded amplitude. This problem explores pulse shapes that can accomplish this.

(a) Find the Fourier transform of $g(t) = \mathrm{sinc}^2(Wt)$. Show that $g(t)$ is bandlimited to
$f \le W$ and sketch both $g(t)$ and $\hat{g}(f)$. (Hint: Recall that multiplication in the time
domain corresponds to convolution in the frequency domain.)

(b) Let $u(t)$ be a continuous real $\mathcal{L}_2$ function baseband limited to $f \le W$ (i.e., a function
such that $u(t) = \sum_k u(kT)\,\mathrm{sinc}(t/T - k)$ where $T = 1/(2W)$). Let $v(t) = u(t) * g(t)$. Express
$v(t)$ in terms of the samples $\{u(kT); k \in \mathbb{Z}\}$ of $u(t)$ and the shifts $\{g(t - kT); k \in \mathbb{Z}\}$ of
$g(t)$. Hint: Use your sketches in part (a) to evaluate $g(t) * \mathrm{sinc}(t/T)$.

(c) Show that if the $T$-spaced samples of $u(t)$ are non-negative, then $v(t) \ge 0$ for all $t$.

(d) Explain why $\sum_k \mathrm{sinc}(t/T - k) = 1$ for all $t$.

(e) Using (d), show that $\sum_k g(t - kT) = c$ for all $t$ and find the constant $c$. Hint: Use the
hint in (b) again.

(f) Now assume that $u(t)$, as defined in part (b), also satisfies $u(kT) \le 1$ for all $k \in \mathbb{Z}$.
Show that $v(t) \le 2$ for all $t$.

(g) Allow $u(t)$ to be complex now, with $|u(kT)| \le 1$. Show that $|v(t)| \le 2$ for all $t$.


4.32. (Orthogonal sets) The function rect(t/T) has the very special property that it, plus its time
and frequency shifts, by kT and j/T respectively, form an orthogonal set. The function
sinc(t/T) has this same property. We explore other functions that are generalizations
of rect(t/T) and which, as you will show in parts (a) to (d), have this same interesting
property. For simplicity, choose T = 1.

These functions take only the values 0 and 1 and are allowed to be non-zero only over [-1, 1]
rather than [-1/2, 1/2] as with rect(t). Explicitly, the functions considered here satisfy
the following constraints:

$$\begin{aligned}
p(t) &= p^2(t) && \text{for all } t \quad \text{(0/1 property)} && (4.108) \\
p(t) &= 0 && \text{for } |t| > 1 && (4.109) \\
p(t) &= p(-t) && \text{for all } t \quad \text{(symmetry)} && (4.110) \\
p(t) &= 1 - p(t-1) && \text{for } 0 \le t < 1/2. && (4.111)
\end{aligned}$$

Note: Because of property (4.110), condition (4.111) also holds for $1/2 < t \le 1$. Note also
that $p(t)$ at the single points $t = \pm 1/2$ does not affect any orthogonality properties, so you
are free to ignore these points in your arguments.


[Figure: rect(t), nonzero on [-1/2, 1/2], alongside another choice of p(t), nonzero only within [-1, 1], that satisfies (1) to (4).]


(a) Show that $p(t)$ is orthogonal to $p(t-1)$. Hint: evaluate $p(t)p(t-1)$ for each $t \in [0, 1]$
other than $t = 1/2$.

(b) Show that $p(t)$ is orthogonal to $p(t-k)$ for all integer $k \ne 0$.

(c) Show that $p(t)$ is orthogonal to $p(t-k)e^{2\pi i m t}$ for integer $m \ne 0$ and $k \ne 0$.

(d) Show that $p(t)$ is orthogonal to $p(t)e^{2\pi i m t}$ for integer $m \ne 0$. Hint: Evaluate
$p(t)e^{-2\pi i m t} + p(t-1)e^{-2\pi i m (t-1)}$.

(e) Let $h(t) = \hat{p}(t)$ where $\hat{p}(f)$ is the Fourier transform of $p(t)$. If $p(t)$ satisfies properties
(1) to (4), does it follow that $h(t)$ has the property that it is orthogonal to $h(t - k)e^{2\pi i m t}$
whenever either the integer $k$ or $m$ is non-zero?


Note: Almost no calculation is required in this problem. 


4.33. (limits) Construct an example of a sequence of $\mathcal{L}_2$ functions $v^{(m)}(t)$, $m \in \mathbb{Z}$, $m > 0$, such that
$\lim_{m \to \infty} v^{(m)}(t) = 0$ for all $t$ but for which $\mathrm{l.i.m.}_{m \to \infty}\, v^{(m)}(t)$ does not exist. In other words show
that pointwise convergence does not imply $\mathcal{L}_2$ convergence. Hint: Consider time shifts.


4.34. (aliasing) Find an example where $\hat{u}(f)$ is 0 for $|f| > 3W$ and nonzero for $W < |f| < 3W$
but where, with $T = 1/(2W)$, $s(kT) = v_0(kT)$ (as defined in (4.77)) for all $k \in \mathbb{Z}$. Hint:
Note that it is equivalent to achieve equality between $\hat{s}(f)$ and $\hat{u}(f)$ for $|f| \le W$. Look at
Figure 4.10.

4.35. (aliasing) The following exercise is designed to illustrate the sampling of an approximately 
baseband waveform. To avoid messy computation, we look at a waveform baseband-limited 
to 3/2 which is sampled at rate 1 (i.e., sampled at only 1/3 the rate that it should be 
sampled at). In particular, let u(t) = sinc(3t). 

(a) Sketch $\hat{u}(f)$. Sketch the function $\hat{v}_m(f) = \mathrm{rect}(f - m)$ for each integer $m$ such that
$\hat{v}_m(f) \ne 0$. Note that $\hat{u}(f) = \sum_m \hat{v}_m(f)$.

(b) Sketch the inverse transforms $v_m(t)$ (real and imaginary part if complex).

(c) Verify directly from the equations that $u(t) = \sum_m v_m(t)$. Hint: this is easiest if you
express the sine part of the sinc function as a sum of complex exponentials.

(d) Verify the sinc-weighted sinusoid expansion, (4.73). (There are only 3 nonzero terms
in the expansion.)

(e) For the approximation $s(t) = u(0)\,\mathrm{sinc}(t)$, find the energy in the difference between $u(t)$
and $s(t)$ and interpret the terms.


4.36. (aliasing) Let $u(t)$ be the inverse Fourier transform of a function $\hat{u}(f)$ which is both $\mathcal{L}_1$ and
$\mathcal{L}_2$. Let $v_m(t) = \int \hat{u}(f)\,\mathrm{rect}(fT - m)\,e^{2\pi i f t}\,df$ and let $v^{(n)}(t) = \sum_{m=-n}^{n} v_m(t)$.

(a) Show that $|u(t) - v^{(n)}(t)| \le \int_{|f| \ge (2n+1)/(2T)} |\hat{u}(f)|\,df$ and thus that $u(t) = \lim_{n \to \infty} v^{(n)}(t)$
for all $t$.

(b) Show that the sinc-weighted sinusoid expansion of (4.76) then converges pointwise for
all $t$. Hint: for any $t$ and any $\epsilon > 0$, choose $n$ so that $|u(t) - v^{(n)}(t)| \le \epsilon/2$. Then for each
$m$, $|m| \le n$, expand $v_m(t)$ in a sampling expansion using enough terms to keep the error
less than $\frac{\epsilon}{4n+2}$.

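4.37. (aliasing) (a) Show that $\hat{s}(f)$ in (4.83) is $\mathcal{L}_1$ if $\hat{u}(f)$ is.

(b) Let $\hat{u}(f) = \sum_{k \ne 0} \mathrm{rect}[k^2(f - k)]$. Show that $\hat{u}(f)$ is $\mathcal{L}_1$ and $\mathcal{L}_2$. Let $T = 1$ for $\hat{s}(f)$ and
show that $\hat{s}(f)$ is not $\mathcal{L}_2$. Hint: Sketch $\hat{u}(f)$ and $\hat{s}(f)$.

(c) Show that $\hat{u}(f)$ does not satisfy $\lim_{|f| \to \infty} \hat{u}(f)|f|^{1+\epsilon} = 0$.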

4.38. (aliasing) Let $u(t) = \sum_{k \ne 0} \mathrm{rect}[k^2(t - k)]$ and show that $u(t)$ is $\mathcal{L}_2$. Find $s(t) =
\sum_k u(k)\,\mathrm{sinc}(t - k)$ and show that it is neither $\mathcal{L}_1$ nor $\mathcal{L}_2$. Find $\sum_k u^2(k)$ and explain
why the sampling theorem energy equation (4.66) does not apply here.

Chapter 5


Vector spaces and signal space 


In the previous chapter, we showed that any $\mathcal{L}_2$ function $u(t)$ can be expanded in various orthogonal
expansions, using such sets of orthogonal functions as the $T$-spaced truncated sinusoids or
the sinc-weighted sinusoids. Thus $u(t)$ may be specified (up to $\mathcal{L}_2$ equivalence) by a countably
infinite sequence such as $\{u_{k,m};\ -\infty < k, m < \infty\}$ of coefficients in such an expansion.


In engineering, n-tuples of numbers are often referred to as vectors, and the use of vector notation
is very helpful in manipulating these n-tuples. The collection of n-tuples of real numbers is called
$\mathbb{R}^n$ and that of complex numbers $\mathbb{C}^n$. It turns out that the most important properties of these
n-tuples also apply to countably infinite sequences of real or complex numbers. It should not
be surprising, after the results of the previous sections, that these properties also apply to $\mathcal{L}_2$
waveforms.


A vector space is essentially a collection of objects (such as the collection of real n-tuples) along
with a set of rules for manipulating those objects. There is a set of axioms describing precisely
how these objects and rules work. Any properties that follow from those axioms must then
apply to any vector space, i.e., any set of objects satisfying those axioms. $\mathbb{R}^n$ and $\mathbb{C}^n$ satisfy
these axioms, and we will see that countable sequences and $\mathcal{L}_2$ waveforms also satisfy them.


Fortunately, it is just as easy to develop the general properties of vector spaces from these
axioms as it is to develop specific properties for the special case of $\mathbb{R}^n$ or $\mathbb{C}^n$ (although we will
constantly use $\mathbb{R}^n$ and $\mathbb{C}^n$ as examples). Fortunately also, we can use the example of $\mathbb{R}^n$ (and
particularly $\mathbb{R}^2$) to develop geometric insight about general vector spaces.


The collection of $\mathcal{L}_2$ functions, viewed as a vector space, will be called signal space. The signal-space
viewpoint has been one of the foundations of modern digital communication theory since
its popularization in the classic text of Wozencraft and Jacobs [35].


The signal-space viewpoint has the following merits: 


• Many insights about waveforms (signals) and signal sets do not depend on time and frequency
(as does the development up until now), but depend only on vector relationships.

• Orthogonal expansions are best viewed in vector space terms.

• Questions of limits and approximation are often easily treated in vector space terms. It is
for this reason that many of the results in Chapter 4 are proved here.


5.1 The axioms and basic properties of vector spaces


A vector space $\mathcal{V}$ is a set of elements, $v \in \mathcal{V}$, called vectors, along with a set of rules for operating
on both these vectors and a set of ancillary elements called scalars. For the treatment here, the
set $\mathbb{F}$ of scalars¹ will either be the set of real numbers $\mathbb{R}$ or the set of complex numbers $\mathbb{C}$. A
vector space with real scalars is called a real vector space, and one with complex scalars is called
a complex vector space.


The most familiar example of a real vector space is $\mathbb{R}^n$. Here the vectors are represented as
n-tuples of real numbers.² $\mathbb{R}^2$ is represented geometrically by a plane, and the vectors in $\mathbb{R}^2$ by
points in the plane. Similarly, $\mathbb{R}^3$ is represented geometrically by three-dimensional Euclidean
space.


The most familiar example of a complex vector space is $\mathbb{C}^n$, the set of n-tuples of complex
numbers.


The axioms of a vector space $\mathcal{V}$ are listed below; they apply to arbitrary vector spaces, and in
particular to the real and complex vector spaces of interest here.


1. Addition: For each $v \in \mathcal{V}$ and $u \in \mathcal{V}$, there is a unique vector $v + u \in \mathcal{V}$, called the sum
of $v$ and $u$, satisfying

(a) Commutativity: $v + u = u + v$;

(b) Associativity: $v + (u + w) = (v + u) + w$ for each $v, u, w \in \mathcal{V}$;

(c) Zero: There is a unique element $0 \in \mathcal{V}$ satisfying $v + 0 = v$ for all $v \in \mathcal{V}$;

(d) Negation: For each $v \in \mathcal{V}$, there is a unique $-v \in \mathcal{V}$ such that $v + (-v) = 0$.


2. Scalar multiplication: For each scalar³ $\alpha$ and each $v \in \mathcal{V}$ there is a unique vector $\alpha v \in \mathcal{V}$
called the scalar product of $\alpha$ and $v$ satisfying

(a) Scalar associativity: $\alpha(\beta v) = (\alpha\beta)v$ for all scalars $\alpha$, $\beta$, and all $v \in \mathcal{V}$;

(b) Unit multiplication: for the unit scalar 1, $1v = v$ for all $v \in \mathcal{V}$.


3. Distributive laws:

(a) For all scalars $\alpha$ and all $v, u \in \mathcal{V}$, $\alpha(v + u) = \alpha v + \alpha u$;

(b) For all scalars $\alpha, \beta$ and all $v \in \mathcal{V}$, $(\alpha + \beta)v = \alpha v + \beta v$.


Example 5.1.1. For $\mathbb{R}^n$, a vector $v$ is an n-tuple $(v_1, \ldots, v_n)$ of real numbers. Addition is
defined by $v + u = (v_1 + u_1, \ldots, v_n + u_n)$. The zero vector is defined by $0 = (0, \ldots, 0)$. The
scalars $\alpha$ are the real numbers, and $\alpha v$ is defined to be $(\alpha v_1, \ldots, \alpha v_n)$. This is illustrated
geometrically in Figure 5.1 for $\mathbb{R}^2$.


Example 5.1.2. The vector space $\mathbb{C}^n$ is the same as $\mathbb{R}^n$ except that $v$ is an n-tuple of complex
numbers and the scalars are complex. Note that $\mathbb{C}^2$ cannot be easily illustrated geometrically,
since a vector in $\mathbb{C}^2$ is specified by 4 real numbers. The reader should verify the axioms for both
$\mathbb{R}^n$ and $\mathbb{C}^n$.


¹More generally, vector spaces can be defined in which the scalars are elements of an arbitrary field. It is not
necessary here to understand the general notion of a field.

²Many people prefer to define $\mathbb{R}^n$ as the class of real vector spaces of dimension n, but almost everyone
visualizes $\mathbb{R}^n$ as the space of n-tuples. More importantly, the space of n-tuples will be constantly used as an
example and $\mathbb{R}^n$ is a convenient name for it.

³Addition, subtraction, multiplication, and division between scalars is done according to the familiar rules of
$\mathbb{R}$ or $\mathbb{C}$ and will not be restated here. Neither $\mathbb{R}$ nor $\mathbb{C}$ includes $\infty$.


[Figure 5.1, annotations in the drawing: Vectors are represented by points or directed lines. The scalar multiple $\alpha u$ and $u$ lie on the same line from 0. The distributive law says that triangles scale correctly.]

Figure 5.1: Geometric interpretation of $\mathbb{R}^2$. The vector $v = (v_1, v_2)$ is represented as a point
in the Euclidean plane with abscissa $v_1$ and ordinate $v_2$. It can also be viewed as the directed
line from 0 to the point $v$. Sometimes, as in the case of $w = u - v$, a vector is viewed as
a directed line from some nonzero point ($v$ in this case) to another point $u$. This geometric
interpretation also suggests the concepts of length and angle, which are not included in the
axioms. This is discussed more fully later.


Example 5.1.3. There is a trivial vector space for which the only element is the zero vector
0. Both for real and complex scalars, $\alpha 0 = 0$. The vector spaces of interest here are non-trivial
spaces, i.e., spaces with more than one element, and this will usually be assumed without further
mention.


Because of the commutative and associative axioms, we see that a finite sum $\sum_j \alpha_j v_j$, where
each $\alpha_j$ is a scalar and $v_j$ a vector, is unambiguously defined without the need for parentheses.
This sum is called a linear combination of the vectors $\{v_j\}$.


We next show that the set of finite-energy complex waveforms can be viewed as a complex vector
space.⁴ When we view a waveform $v(t)$ as a vector, we denote it by $v$. There are two reasons for
this: first, it reminds us that we are viewing the waveform as a vector; second, $v(t)$ sometimes
denotes a function and sometimes denotes the value of that function at a particular argument
$t$. Denoting the function as $v$ avoids this ambiguity.


The vector sum v + u is defined in the obvious way as the waveform for which each t is mapped 
into v(t) + u(t); the scalar product av is defined as the waveform for which each t is mapped 
into av(t). The vector 0 is defined as the waveform that maps each t into 0. 


The vector space axioms are not difficult to verify for this space of waveforms. To show that the
sum $v + u$ of two finite-energy waveforms $v$ and $u$ also has finite energy, recall first that the
sum of two measurable waveforms is also measurable. Next, recall that if $v$ and $u$ are complex
numbers, then $|v + u|^2 \le 2|v|^2 + 2|u|^2$. Thus,

$$\int_{-\infty}^{\infty} |v(t) + u(t)|^2\,dt \le \int_{-\infty}^{\infty} 2|v(t)|^2\,dt + \int_{-\infty}^{\infty} 2|u(t)|^2\,dt < \infty. \tag{5.1}$$

Similarly, if $v$ has finite energy, then $\alpha v$ has $|\alpha|^2$ times the energy of $v$, which is also finite. The
other axioms can be verified by inspection.
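
The pointwise inequality used in (5.1) is only recalled in the text. As a brief sketch of why it holds (a standard identity, not spelled out in the original), for any complex numbers $v$ and $u$:

```latex
\begin{align*}
|v+u|^2 + |v-u|^2 &= 2|v|^2 + 2|u|^2, \\
\text{so}\quad |v+u|^2 &= 2|v|^2 + 2|u|^2 - |v-u|^2 \;\le\; 2|v|^2 + 2|u|^2 .
\end{align*}
```

Applying this at each $t$ with $v = v(t)$ and $u = u(t)$, and then integrating over $t$, gives (5.1).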


The above argument has shown that the set of finite-energy waveforms, along with the definitions
of addition and complex scalar multiplication, forms a complex vector space. The set of real
finite-energy waveforms, along with the analogous addition and real scalar multiplication, forms a
real vector space.


⁴There is a small but important technical difference between the vector space being defined here and what we
will later define to be the vector space $\mathcal{L}_2$. This difference centers on the notion of $\mathcal{L}_2$ equivalence, and will be
discussed later.


5.1.1 Finite-dimensional vector spaces 


A set of vectors $v_1, \ldots, v_n \in \mathcal{V}$ spans $\mathcal{V}$ (and is called a spanning set of $\mathcal{V}$) if every vector
$v \in \mathcal{V}$ is a linear combination of $v_1, \ldots, v_n$. For the $\mathbb{R}^n$ example, let $e_1 = (1, 0, 0, \ldots, 0)$,
$e_2 = (0, 1, 0, \ldots, 0)$, ..., $e_n = (0, \ldots, 0, 1)$ be the n unit vectors of $\mathbb{R}^n$. The unit vectors span $\mathbb{R}^n$
since every vector $v \in \mathbb{R}^n$ can be expressed as a linear combination of the unit vectors, i.e.,

$$v = (\alpha_1, \ldots, \alpha_n) = \sum_{j=1}^{n} \alpha_j e_j.$$


A vector space $\mathcal{V}$ is finite-dimensional if a finite set of vectors $u_1, \ldots, u_n$ exists that spans $\mathcal{V}$. Thus
$\mathbb{R}^n$ is finite-dimensional since it is spanned by $e_1, \ldots, e_n$. Similarly, $\mathbb{C}^n$ is finite-dimensional. If
$\mathcal{V}$ is not finite-dimensional, then it is infinite-dimensional; we will soon see that $\mathcal{L}_2$ is
infinite-dimensional.


A set of vectors $v_1, \ldots, v_n \in \mathcal{V}$ is linearly dependent if $\sum_{j=1}^{n} \alpha_j v_j = 0$ for some set of scalars
not all equal to 0. This implies that each vector $v_k$ for which $\alpha_k \ne 0$ is a linear combination of
the others, i.e.,

$$v_k = \sum_{j \ne k} \frac{-\alpha_j}{\alpha_k}\, v_j.$$

A set of vectors $v_1, \ldots, v_n \in \mathcal{V}$ is linearly independent if it is not linearly dependent, i.e., if
$\sum_{j=1}^{n} \alpha_j v_j = 0$ implies that each $\alpha_j$ is 0. For brevity we often omit the word “linear” when we
refer to independence or dependence.


It can be seen that the unit vectors $e_1, \ldots, e_n$, as elements of $\mathbb{R}^n$, are linearly independent.
Similarly, they are linearly independent as elements of $\mathbb{C}^n$.


A set of vectors $v_1, \ldots, v_n \in \mathcal{V}$ is defined to be a basis for $\mathcal{V}$ if the set both spans $\mathcal{V}$ and is
linearly independent. Thus the unit vectors $e_1, \ldots, e_n$ form a basis for $\mathbb{R}^n$. Similarly, the unit
vectors, as elements of $\mathbb{C}^n$, form a basis for $\mathbb{C}^n$.


The following theorem is both important and simple; see Exercise 5.1 or any linear algebra text 
for a proof. 

Theorem 5.1.1 (Basis for finite-dimensional vector space). Let $\mathcal{V}$ be a non-trivial
finite-dimensional vector space.⁵ Then


• If $v_1, \ldots, v_m$ span $\mathcal{V}$ but are linearly dependent, then a subset of $v_1, \ldots, v_m$ forms a basis
for $\mathcal{V}$ with $n < m$ vectors.


• If $v_1, \ldots, v_m$ are linearly independent but do not span $\mathcal{V}$, then there exists a basis for $\mathcal{V}$
with $n > m$ vectors that includes $v_1, \ldots, v_m$.


• Every basis of $\mathcal{V}$ contains the same number of vectors.


⁵The trivial vector space whose only element is 0 is conventionally called a zero-dimensional space and could
be viewed as having an empty basis.

The dimension of a finite-dimensional vector space may now be defined as the number of vectors
in a basis. The theorem implicitly provides two conceptual algorithms for finding a basis. First, 
start with any linearly independent set (such as a single nonzero vector) and successively add 
independent vectors until reaching a spanning set. Second, start with any spanning set and 
successively eliminate dependent vectors until reaching a linearly independent set. 
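
As a rough illustration of the second conceptual algorithm, the sketch below walks through a spanning set of real n-tuples and drops every vector that depends on the vectors already kept. The function names `is_independent` and `extract_basis` are hypothetical, and the independence test (exact Gaussian elimination on rationals) is one convenient choice assumed here, not a method given in the text.

```python
from fractions import Fraction


def is_independent(vectors):
    """Return True if the given real n-tuples are linearly independent."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column in all rows below the pivot row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    # Independent exactly when the rank equals the number of vectors.
    return rank == len(rows)


def extract_basis(spanning_set):
    """Keep each vector that is independent of those kept so far; drop the rest."""
    basis = []
    for v in spanning_set:
        if is_independent(basis + [v]):
            basis.append(v)
    return basis


spanning = [(1, 0, 1), (0, 1, 1), (1, 1, 2), (2, 0, 2), (0, 0, 1)]
print(extract_basis(spanning))  # [(1, 0, 1), (0, 1, 1), (0, 0, 1)]
```

Here (1, 1, 2) and (2, 0, 2) are discarded because each is a linear combination of vectors kept earlier; the three survivors span $\mathbb{R}^3$, consistent with every basis of $\mathbb{R}^3$ having three vectors.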


Given any basis, $v_1, \ldots, v_n$, for a finite-dimensional vector space $\mathcal{V}$, any vector $v \in \mathcal{V}$ can be
represented as

$$v = \sum_{j=1}^{n} \alpha_j v_j, \quad \text{where } \alpha_1, \ldots, \alpha_n \text{ are scalars.} \tag{5.2}$$

In terms of the given basis, each $v \in \mathcal{V}$ can be uniquely represented by the n-tuple of coefficients
$(\alpha_1, \ldots, \alpha_n)$ in (5.2). Thus any n-dimensional vector space $\mathcal{V}$ over $\mathbb{R}$ or $\mathbb{C}$ may be viewed
(relative to a given basis) as a version⁶ of $\mathbb{R}^n$ or $\mathbb{C}^n$. This leads to the elementary vector/matrix
approach to linear algebra. What is gained by the axiomatic (“coordinate-free”) approach is the
ability to think about vectors without first specifying a basis. We see the value of this shortly
when we define subspaces and look at finite-dimensional subspaces of infinite-dimensional vector
spaces such as $\mathcal{L}_2$.


5.2 Inner product spaces 


The vector space axioms above contain no inherent notion of length or angle, although such
geometric properties are clearly present in Figure 5.1 and in our intuitive view of $\mathbb{R}^n$ or $\mathbb{C}^n$.
The missing ingredient is that of an inner product.


An inner product on a complex vector space $\mathcal{V}$ is a complex-valued function of two vectors,
$v, u \in \mathcal{V}$, denoted by $\langle v, u \rangle$, that satisfies the following axioms:


(a) Hermitian symmetry: $\langle v, u \rangle = \langle u, v \rangle^*$;

(b) Hermitian bilinearity: $\langle \alpha v + \beta u, w \rangle = \alpha \langle v, w \rangle + \beta \langle u, w \rangle$
(and consequently $\langle v, \alpha u + \beta w \rangle = \alpha^* \langle v, u \rangle + \beta^* \langle v, w \rangle$);

(c) Strict positivity: $\langle v, v \rangle \ge 0$, with equality if and only if $v = 0$.

A vector space with an inner product satisfying these axioms is called an inner product space.


The same definition applies to a real vector space, but the inner product is always real and the
complex conjugates can be omitted.


The norm or length $\|v\|$ of a vector $v$ in an inner product space is defined as

$$\|v\| = \sqrt{\langle v, v \rangle}.$$

Two vectors $v$ and $u$ are defined to be orthogonal if $\langle v, u \rangle = 0$. Thus we see that the important
geometric notions of length and orthogonality are both defined in terms of the inner product.


⁶More precisely, $\mathcal{V}$ and $\mathbb{R}^n$ ($\mathbb{C}^n$) are isomorphic in the sense that there is a one-to-one correspondence
between vectors in $\mathcal{V}$ and n-tuples in $\mathbb{R}^n$ ($\mathbb{C}^n$) that preserves the vector space operations. In plain English, solvable
problems concerning vectors in $\mathcal{V}$ can always be solved by first translating to n-tuples in a basis and then working
in $\mathbb{R}^n$ or $\mathbb{C}^n$.

5.2.1 The inner product spaces $\mathbb{R}^n$ and $\mathbb{C}^n$


For the vector space $\mathbb{R}^n$ of real n-tuples, the inner product of vectors $v = (v_1, \ldots, v_n)$ and
$u = (u_1, \ldots, u_n)$ is usually defined (and is defined here) as

$$\langle v, u \rangle = \sum_{j=1}^{n} v_j u_j.$$

You should verify that this definition satisfies the inner product axioms above.


The length $\|v\|$ of a vector $v$ is then $\sqrt{\sum_j v_j^2}$, which agrees with Euclidean geometry. Recall
that the formula for the cosine of the angle between two arbitrary nonzero vectors in $\mathbb{R}^2$ is given by

$$\cos(\angle(v, u)) = \frac{v_1 u_1 + v_2 u_2}{\sqrt{v_1^2 + v_2^2}\,\sqrt{u_1^2 + u_2^2}} = \frac{\langle v, u \rangle}{\|v\|\,\|u\|}, \tag{5.3}$$

where the final equality also expresses this as an inner product. Thus the inner product
determines the angle between vectors in $\mathbb{R}^2$. This same inner product formula will soon be seen
to be valid in any real vector space, and the derivation is much simpler in the coordinate-free
environment of general vector spaces than in the unit vector context of $\mathbb{R}^2$.


For the vector space $\mathbb{C}^n$ of complex n-tuples, the inner product is defined as

$$\langle v, u \rangle = \sum_{j=1}^{n} v_j u_j^*. \tag{5.4}$$

The norm, or length, of $v$ is then $\sqrt{\sum_j |v_j|^2} = \sqrt{\sum_j [\Re(v_j)^2 + \Im(v_j)^2]}$. Thus, as far as length is
concerned, a complex n-tuple $u$ can be regarded as the real 2n-vector formed from the real and
imaginary parts of $u$. Warning: although a complex n-tuple can be viewed as a real 2n-tuple
for some purposes, such as length, many other operations on complex n-tuples are very different
from those operations on the corresponding real 2n-tuple. For example, scalar multiplication
and inner products in $\mathbb{C}^n$ are very different from those operations in $\mathbb{R}^{2n}$.
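
The warning can be made concrete with a small numerical sketch. The helper names (`inner`, `norm`, `as_real_2n`) and the sample vectors are illustrative assumptions, not part of the text; the inner product follows (5.4).

```python
import math


def inner(v, u):
    """Inner product on C^n as in (5.4): sum over j of v_j times conj(u_j)."""
    return sum(vj * complex(uj).conjugate() for vj, uj in zip(v, u))


def norm(v):
    """Length of v, the square root of <v, v>."""
    return math.sqrt(inner(v, v).real)


def as_real_2n(v):
    """Real 2n-tuple made of the real parts followed by the imaginary parts."""
    return [complex(z).real for z in v] + [complex(z).imag for z in v]


v = [1 + 2j, 3 - 1j]
u = [2 - 1j, 1j]

# Length is the same whether v is viewed in C^2 or in R^4.
same_length = math.isclose(norm(v), norm(as_real_2n(v)))

# The inner products differ: the real 2n-tuple version keeps only the real part.
complex_ip = inner(v, u)                       # equals -1+2j
real_ip = inner(as_real_2n(v), as_real_2n(u))  # equals -1

print(same_length, complex_ip, real_ip)
```

For these vectors the two lengths agree, while the inner product in $\mathbb{C}^2$ carries an imaginary part that the inner product of the corresponding real 4-tuples cannot represent.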


5.2.2 One-dimensional projections 


An important problem in constructing orthogonal expansions is that of breaking a vector $v$
into two components relative to another vector $u \ne 0$ in the same inner-product space. One
component, $v_{\perp u}$, is to be orthogonal (i.e., perpendicular) to $u$ and the other, $v_{|u}$, is to be
collinear with $u$ (two vectors $v_{|u}$ and $u$ are collinear if $v_{|u} = \alpha u$ for some scalar $\alpha$). Figure 5.2
illustrates this decomposition for vectors in $\mathbb{R}^2$. We can view this geometrically as dropping a
perpendicular from $v$ to $u$. From the geometry of Figure 5.2, $\|v_{|u}\| = \|v\| \cos(\angle(v, u))$. Using
(5.3), $\|v_{|u}\| = \langle v, u \rangle / \|u\|$. Since $v_{|u}$ is also collinear with $u$, it can be seen that

$$v_{|u} = \frac{\langle v, u \rangle}{\|u\|^2}\, u. \tag{5.5}$$

The vector $v_{|u}$ is called the projection of $v$ onto $u$.


Rather surprisingly, (5.5) is valid for any inner product space. The general proof that follows is
also simpler than the derivation of (5.3) and (5.5) using plane geometry.

Figure 5.2: Two vectors, $v = (v_1, v_2)$ and $u = (u_1, u_2)$ in $\mathbb{R}^2$. Note that $\|u\|^2 = \langle u, u \rangle =
u_1^2 + u_2^2$ is the squared length of $u$. The vector $v$ is also expressed as $v = v_{|u} + v_{\perp u}$ where
$v_{|u}$ is collinear with $u$ and $v_{\perp u}$ is perpendicular to $u$.


Theorem 5.2.1 (One-dimensional projection theorem). Let $v$ and $u$ be arbitrary vectors
with $u \ne 0$ in a real or complex inner product space. Then there is a unique scalar $\alpha$ for which
$\langle v - \alpha u, u \rangle = 0$. That $\alpha$ is given by $\alpha = \langle v, u \rangle / \|u\|^2$.


Remark: The theorem states that $v - \alpha u$ is perpendicular to $u$ if and only if $\alpha = \langle v, u \rangle / \|u\|^2$.
Using that value of $\alpha$, $v - \alpha u$ is called the perpendicular to $u$ and is denoted as $v_{\perp u}$; similarly
$\alpha u$ is called the projection of $v$ on $u$ and is denoted as $v_{|u}$. Finally, $v = v_{\perp u} + v_{|u}$, so $v$ has
been split into a perpendicular part and a collinear part.


Proof: Calculating $\langle v - \alpha u, u \rangle$ for an arbitrary scalar $\alpha$, the conditions can be found under
which this inner product is zero:

$$\langle v - \alpha u, u \rangle = \langle v, u \rangle - \alpha \langle u, u \rangle = \langle v, u \rangle - \alpha \|u\|^2,$$

which is equal to zero if and only if $\alpha = \langle v, u \rangle / \|u\|^2$.


The reason why $\|u\|^2$ is in the denominator of the projection formula can be understood by
rewriting (5.5) as

$$v_{|u} = \left\langle v, \frac{u}{\|u\|} \right\rangle \frac{u}{\|u\|}.$$

In words, the projection of $v$ on $u$ is the same as the projection of $v$ on the normalized version
of $u$. More generally, the value of $v_{|u}$ is invariant to scale changes in $u$, i.e.,

$$v_{|\beta u} = \frac{\langle v, \beta u \rangle}{\|\beta u\|^2}\, \beta u = \frac{\langle v, u \rangle}{\|u\|^2}\, u = v_{|u}. \tag{5.6}$$

This is clearly consistent with the geometric picture in Figure 5.2 for $\mathbb{R}^2$, but it is also valid for
complex vector spaces where such figures cannot be drawn.
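
Since no figure is available in the complex case, a small numerical sketch may help. It computes the projection of Theorem 5.2.1 for two vectors in $\mathbb{C}^2$ and checks the scale invariance in (5.6). The function names and the sample vectors are assumptions chosen for illustration.

```python
def inner(v, u):
    """Inner product <v, u> = sum_j v_j * conj(u_j) on C^n."""
    return sum(vj * complex(uj).conjugate() for vj, uj in zip(v, u))


def project(v, u):
    """Return (v_par, v_perp): the projection of v on nonzero u and the remainder."""
    alpha = inner(v, u) / inner(u, u).real  # alpha = <v,u> / ||u||^2
    v_par = [alpha * uj for uj in u]
    v_perp = [vj - pj for vj, pj in zip(v, v_par)]
    return v_par, v_perp


v = [1 + 1j, 2]
u = [1j, 1]
v_par, v_perp = project(v, u)

# The perpendicular part has zero inner product with u (up to rounding).
residual = abs(inner(v_perp, u))

# Scaling u by a nonzero complex beta leaves the projection unchanged, as in (5.6).
beta = 2 - 3j
v_par_scaled, _ = project(v, [beta * uj for uj in u])
max_change = max(abs(a - b) for a, b in zip(v_par, v_par_scaled))

print(v_par, residual, max_change)
```

Note that the conjugate in the inner product is what makes the factor $\beta$ cancel: $\langle v, \beta u \rangle = \beta^* \langle v, u \rangle$ and $\|\beta u\|^2 = |\beta|^2 \|u\|^2$.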
In $\mathbb{R}^2$, the cosine formula can be rewritten as

$$\cos(\angle(u, v)) = \left\langle \frac{u}{\|u\|}, \frac{v}{\|v\|} \right\rangle. \tag{5.7}$$

That is, the cosine of $\angle(u, v)$ is the inner product of the normalized versions of $u$ and $v$.
Another well known result in $\mathbb{R}^2$ that carries over to any inner product space is the Pythagorean
theorem: If $v$ and $u$ are orthogonal, then

$$\|v + u\|^2 = \|v\|^2 + \|u\|^2. \tag{5.8}$$

To see this, note that

$$\langle v + u, v + u \rangle = \langle v, v \rangle + \langle v, u \rangle + \langle u, v \rangle + \langle u, u \rangle.$$

The cross terms disappear by orthogonality, yielding (5.8).
Theorem 5.2.1 has an important corollary, called the Schwarz inequality:


Corollary 5.2.1 (Schwarz inequality). Let $v$ and $u$ be vectors in a real or complex inner
product space. Then

$$|\langle v, u \rangle| \le \|v\|\,\|u\|. \tag{5.9}$$

Proof: Assume $u \ne 0$ since (5.9) is obvious otherwise. Since $v_{|u}$ and $v_{\perp u}$ are orthogonal, (5.8)
shows that

$$\|v\|^2 = \|v_{|u}\|^2 + \|v_{\perp u}\|^2.$$

Since $\|v_{\perp u}\|^2$ is nonnegative, we have

$$\|v\|^2 \ge \|v_{|u}\|^2 = \left| \frac{\langle v, u \rangle}{\|u\|^2} \right|^2 \|u\|^2 = \frac{|\langle v, u \rangle|^2}{\|u\|^2},$$

which is equivalent to (5.9).


For $v$ and $u$ both nonzero, the Schwarz inequality may be rewritten in the form

$$\left| \left\langle \frac{v}{\|v\|}, \frac{u}{\|u\|} \right\rangle \right| \le 1.$$

In $\mathbb{R}^2$, the Schwarz inequality is thus equivalent to the familiar fact that the cosine function is
upper-bounded by 1.

As shown in Exercise 5.6, the triangle inequality below is a simple consequence of the Schwarz
inequality.

$$\|v + u\| \le \|v\| + \|u\|. \tag{5.10}$$
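
As a sanity check rather than a proof, the following sketch tests (5.9) and (5.10) on randomly drawn vectors in $\mathbb{C}^4$. The dimension, the sampling range, the seed, and the tolerance are arbitrary choices made here for illustration.

```python
import math
import random


def inner(v, u):
    """Inner product <v, u> = sum_j v_j * conj(u_j) on C^n."""
    return sum(vj * complex(uj).conjugate() for vj, uj in zip(v, u))


def norm(v):
    """Length of v, the square root of <v, v>."""
    return math.sqrt(inner(v, v).real)


def random_vector(rng, n):
    """Random complex n-tuple with real and imaginary parts in [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
tol = 1e-12             # slack for floating-point rounding
schwarz_ok = True
triangle_ok = True
for _ in range(1000):
    v, u = random_vector(rng, 4), random_vector(rng, 4)
    s = [a + b for a, b in zip(v, u)]
    schwarz_ok &= abs(inner(v, u)) <= norm(v) * norm(u) + tol
    triangle_ok &= norm(s) <= norm(v) + norm(u) + tol

print(schwarz_ok, triangle_ok)
```

Equality in (5.9) occurs when $v_{\perp u} = 0$, i.e., when $v$ is collinear with $u$, which random draws essentially never produce.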


5.2.3 The inner product space of $\mathcal{L}_2$ functions


Consider the set of complex finite-energy waveforms again. We attempt to define the inner
product of two vectors $v$ and $u$ in this set as

$$\langle v, u \rangle = \int_{-\infty}^{\infty} v(t) u^*(t)\,dt. \tag{5.11}$$

It is shown in Exercise 5.8 that $\langle v, u \rangle$ is always finite. The Schwarz inequality cannot be used
to prove this, since we have not yet shown that this satisfies the axioms of an inner product
space. However, the first two inner product axioms follow immediately from the existence and
finiteness of the inner product, i.e., the integral in (5.11). This existence and finiteness is a vital
and useful property of $\mathcal{L}_2$.


The final inner product axiom is that $\langle v, v \rangle \ge 0$, with equality if and only if $v = 0$. This axiom
does not hold for finite-energy waveforms, because as we have already seen, if a function $v(t)$ is
zero almost everywhere, then its energy is 0, even though the function is not the zero function.
This is a nit-picking issue at some level, but axioms cannot be ignored simply because they are
inconvenient.


The resolution of this problem is to define equality in an $\mathcal{L}_2$ inner product space as $\mathcal{L}_2$-equivalence
between $\mathcal{L}_2$ functions. What this means is that each element of an $\mathcal{L}_2$ inner product space is
the equivalence class of $\mathcal{L}_2$ functions that are equal almost everywhere. For example, the zero
equivalence class is the class of zero-energy functions, since each is $\mathcal{L}_2$ equivalent to the all-zero
function. With this modification, the inner product axioms all hold.


Viewing a vector as an equivalence class of $\mathcal{L}_2$ functions seems very abstract and strange at
first. From an engineering perspective, however, the notion that all zero-energy functions are 
the same is more natural than the notion that two functions that differ in only a few isolated 
points should be regarded as different. 


From a more practical viewpoint, it will be seen later that $\mathcal{L}_2$ functions (in this equivalence
class sense) can be represented by the coefficients in any orthogonal expansion whose elements
span the $\mathcal{L}_2$ space. Two ordinary functions have the same coefficients in such an orthogonal
expansion if and only if they are $\mathcal{L}_2$ equivalent. Thus each element $u$ of the $\mathcal{L}_2$ inner product
space is in one-to-one correspondence with a finite-energy sequence $\{u_k; k \in \mathbb{Z}\}$ of coefficients in an
orthogonal expansion. Thus we can now avoid the awkwardness of having many $\mathcal{L}_2$ equivalent
ordinary functions map into a single sequence of coefficients and having no very good way of
going back from sequence to function. Once again engineering common sense and sophisticated
mathematics come together.


From now on we will simply view $\mathcal{L}_2$ as an inner product space, referring to the notion of $\mathcal{L}_2$
equivalence only when necessary. With this understanding, we can use all the machinery of
inner product spaces, including projections and the Schwarz inequality.


5.2.4 Subspaces of inner product spaces 


A subspace $\mathcal{S}$ of a vector space $\mathcal{V}$ is a subset of the vectors in $\mathcal{V}$ which forms a vector space in
its own right (over the same set of scalars as used by $\mathcal{V}$). An equivalent definition is that for all
$v$ and $u \in \mathcal{S}$, the linear combination $\alpha v + \beta u$ is in $\mathcal{S}$ for all scalars $\alpha$ and $\beta$. If $\mathcal{V}$ is an inner
product space, then it can be seen that $\mathcal{S}$ is also an inner product space using the same inner
product definition as $\mathcal{V}$.


Example 5.2.1 (Subspaces of $\mathbb{R}^3$). Consider the real inner product space $\mathbb{R}^3$, namely the
inner product space of real 3-tuples $v = (v_1, v_2, v_3)$. Geometrically, we regard this as a space
in which there are three orthogonal coordinate directions, defined by the three unit vectors
$e_1, e_2, e_3$. The 3-tuple $v_1, v_2, v_3$ then specifies the length of $v$ in each of those directions, so that
$v = v_1 e_1 + v_2 e_2 + v_3 e_3$.

Let $u = (1, 0, 1)$ and $w = (0, 1, 1)$ be two fixed vectors, and consider the subspace of $\mathbb{R}^3$ composed
of all linear combinations, $v = \alpha u + \beta w$, of $u$ and $w$. Geometrically, this subspace is a plane
going through the points $0$, $u$, and $w$. In this plane, as in the original vector space, $u$ and $w$
each have length $\sqrt{2}$ and $\langle u, w \rangle = 1$.


Since neither $u$ nor $w$ is a scalar multiple of the other, they are linearly independent. They
span $\mathcal{S}$ by definition, so $\mathcal{S}$ is a two-dimensional subspace with a basis $\{u, w\}$.


The projection of $u$ on $w$ is $u_{|w} = (0, 1/2, 1/2)$, and the perpendicular is $u_{\perp w} = (1, -1/2, 1/2)$.
These vectors form an orthogonal basis for $\mathcal{S}$. Using these vectors as an orthogonal basis, we
can view $\mathcal{S}$, pictorially and geometrically, in just the same way as we view vectors in $\mathbb{R}^2$.
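
The numbers in Example 5.2.1 can be reproduced exactly with rational arithmetic. This is a minimal sketch assuming integer-valued input vectors; the helper names `inner` and `project` are hypothetical.

```python
from fractions import Fraction


def inner(v, u):
    """Real inner product on R^n: sum_j v_j * u_j."""
    return sum(a * b for a, b in zip(v, u))


def project(v, u):
    """Projection of v on nonzero u and the perpendicular remainder, computed exactly.

    Assumes integer (or rational) entries so that Fraction can form the ratio.
    """
    alpha = Fraction(inner(v, u), inner(u, u))  # alpha = <v,u> / ||u||^2
    par = [alpha * x for x in u]
    perp = [a - b for a, b in zip(v, par)]
    return par, perp


u = [1, 0, 1]
w = [0, 1, 1]
u_par, u_perp = project(u, w)

print(u_par)                 # [Fraction(0, 1), Fraction(1, 2), Fraction(1, 2)]
print(u_perp)                # [Fraction(1, 1), Fraction(-1, 2), Fraction(1, 2)]
print(inner(u_par, u_perp))  # 0
```

The two parts sum to $u$ and have zero inner product, which is what makes them an orthogonal basis for the plane $\mathcal{S}$.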


Example 5.2.2 (General 2D subspace). Let $\mathcal{V}$ be an arbitrary non-trivial real or complex
inner product space, and let $u$ and $w$ be arbitrary noncollinear vectors. Then the set $\mathcal{S}$ of linear
combinations of $u$ and $w$ is a two-dimensional subspace of $\mathcal{V}$ with basis $\{u, w\}$. Again, $u_{|w}$
and $u_{\perp w}$ form an orthogonal basis of $\mathcal{S}$. We will soon see that this procedure for generating
subspaces and orthogonal bases from two vectors in an arbitrary inner product space can be
generalized to orthogonal bases for subspaces of arbitrary dimension.


Example 5.2.3 ($\mathbb{R}^2$ is a subset but not a subspace of $\mathbb{C}^2$). Consider the complex vector
space $\mathbb{C}^2$. The set of real 2-tuples is a subset of $\mathbb{C}^2$, but this subset is not closed under
multiplication by scalars in $\mathbb{C}$. For example, the real 2-tuple $u = (1, 2)$ is an element of $\mathbb{C}^2$ but the
scalar product $iu$ is the vector $(i, 2i)$, which is not a real 2-tuple. More generally, the notion of
linear combination (which is at the heart of both the use and theory of vector spaces) depends
on what the scalars are.


We cannot avoid dealing with both complex and real $\mathcal{L}_2$ waveforms without enormously
complicating the subject (as a simple example, consider using the sine and cosine forms of the Fourier
transform and series). We also cannot avoid inner product spaces without great complication.
Finally we cannot avoid going back and forth between complex and real $\mathcal{L}_2$ waveforms. The
price of this is frequent confusion between real and complex scalars. The reader is advised to
use considerable caution with linear combinations and to be very clear about whether real or
complex scalars are involved.


5.3 Orthonormal bases and the projection theorem 


In an inner product space, a set of vectors $\phi_1, \phi_2, \ldots$ is orthonormal if

$$\langle \phi_j, \phi_k\rangle = \begin{cases} 0 & \text{for } j \ne k \\ 1 & \text{for } j = k. \end{cases} \tag{5.12}$$

In other words, an orthonormal set is a set of nonzero orthogonal vectors where each vector is
normalized to unit length. It can be seen that if a set of nonzero vectors $u_1, u_2, \ldots$ is orthogonal, then
the set

$$\phi_j = \frac{1}{\|u_j\|}\, u_j$$

is orthonormal. Note that if two nonzero vectors are orthogonal, then any scaling (including
normalization) of each vector maintains orthogonality.


If a vector $v$ is projected onto a normalized vector $\phi$, then the one-dimensional projection
theorem states that the projection is given by the simple formula

$$v_{|\phi} = \langle v, \phi\rangle\, \phi. \tag{5.13}$$

Furthermore, the theorem asserts that $v_{\perp\phi} = v - v_{|\phi}$ is orthogonal to $\phi$. We now generalize the
Projection Theorem to the projection of a vector $v \in V$ onto any finite-dimensional subspace $S$
of $V$.


5.3.1 Finite-dimensional projections


If $S$ is a subspace of an inner product space $V$, and $v \in V$, then a projection of $v$ on $S$ is defined
to be a vector $v_{|S} \in S$ such that $v - v_{|S}$ is orthogonal to all vectors in $S$. The theorem to follow
shows that $v_{|S}$ always exists and has a unique value given in the theorem.


The earlier definition of projection is a special case of that here in which $S$ is taken to be the
one-dimensional subspace spanned by a vector $u$ (the orthonormal basis is then $\phi = u/\|u\|$).


Theorem 5.3.1 (Projection theorem). Let $S$ be an $n$-dimensional subspace of an inner
product space $V$ and assume that $\{\phi_1, \phi_2, \ldots, \phi_n\}$ is an orthonormal basis for $S$. Then for
any $v \in V$, there is a unique vector $v_{|S} \in S$ such that $\langle v - v_{|S}, s\rangle = 0$ for all $s \in S$. Furthermore,
$v_{|S}$ is given by

$$v_{|S} = \sum_{j=1}^{n} \langle v, \phi_j\rangle\, \phi_j. \tag{5.14}$$


Remark: The theorem assumes that S has a set of orthonormal vectors as a basis. It will be 
shown later that any non-trivial finite-dimensional inner product space has such an orthonormal 
basis, so that the assumption does not restrict the generality of the theorem. 


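Proof: Let $w = \sum_{j=1}^{n} \alpha_j \phi_j$ be an arbitrary vector in $S$. First consider the conditions on $w$
under which $v - w$ is orthogonal to all vectors $s \in S$. It can be seen that $v - w$ is orthogonal
to all $s \in S$ if and only if

$$\langle v - w, \phi_j\rangle = 0, \quad \text{for all } j,\ 1 \le j \le n,$$

or equivalently if and only if

$$\langle v, \phi_j\rangle = \langle w, \phi_j\rangle, \quad \text{for all } j,\ 1 \le j \le n. \tag{5.15}$$

Since $w = \sum_{\ell=1}^{n} \alpha_\ell \phi_\ell$,

$$\langle w, \phi_j\rangle = \sum_{\ell=1}^{n} \alpha_\ell \langle \phi_\ell, \phi_j\rangle = \alpha_j, \quad \text{for all } j,\ 1 \le j \le n. \tag{5.16}$$

Combining this with (5.15), $v - w$ is orthogonal to all $s \in S$ if and only if $\alpha_j = \langle v, \phi_j\rangle$ for each
$j$, i.e., if and only if $w = \sum_j \langle v, \phi_j\rangle \phi_j$. Thus $v_{|S}$ as given in (5.14) is the unique vector $w \in S$
for which $v - v_{|S}$ is orthogonal to all $s \in S$.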


The vector $v - v_{|S}$ is denoted as $v_{\perp S}$, the perpendicular from $v$ to $S$. Since $v_{|S} \in S$, we see that
$v_{|S}$ and $v_{\perp S}$ are orthogonal. The theorem then asserts that $v$ can be uniquely split into two
orthogonal components, $v = v_{|S} + v_{\perp S}$, where the projection $v_{|S}$ is in $S$ and the perpendicular
$v_{\perp S}$ is orthogonal to all vectors $s \in S$.
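
As an illustration of the projection theorem, the sketch below projects a vector onto the plane of Example 5.2.1 using formula (5.14). It assumes NumPy; the function names `inner` and `project` and the test vector are illustrative choices, not part of the text.

```python
import numpy as np

def inner(v, u):
    """Inner product <v, u> = sum_k v_k * conj(u_k), linear in the first argument."""
    return np.vdot(u, v)

def project(v, basis):
    """Projection of v on the span of an orthonormal basis, following (5.14)."""
    v = np.asarray(v, dtype=complex)
    return sum((inner(v, phi) * phi for phi in basis), np.zeros_like(v))

# Orthonormal basis for the plane of Example 5.2.1, obtained by
# normalizing the orthogonal pair (0, 1/2, 1/2) and (1, -1/2, 1/2).
phi1 = np.array([0.0, 0.5, 0.5], dtype=complex)
phi2 = np.array([1.0, -0.5, 0.5], dtype=complex)
phi1 /= np.linalg.norm(phi1)
phi2 /= np.linalg.norm(phi2)
basis = [phi1, phi2]

v = np.array([3.0, 1.0, 0.0], dtype=complex)   # arbitrary vector not in the plane
v_par = project(v, basis)                      # projection of v on S
v_perp = v - v_par                             # perpendicular from v to S
residuals = [abs(inner(v_perp, phi)) for phi in basis]
```

The entries of `residuals` should be zero up to rounding, which is the orthogonality property asserted by the theorem.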


5.3.2 Corollaries of the projection theorem 


There are three important corollaries of the projection theorem that involve the norm of the
projection. First, for any scalars $\alpha_1, \ldots, \alpha_n$, the squared norm of $w = \sum_j \alpha_j \phi_j$ is given by

$$\|w\|^2 = \Big\langle w, \sum_{j=1}^{n} \alpha_j \phi_j \Big\rangle = \sum_{j=1}^{n} \alpha_j^* \langle w, \phi_j\rangle = \sum_{j=1}^{n} |\alpha_j|^2,$$

where (5.16) has been used in the last step. For the projection $v_{|S}$, $\alpha_j = \langle v, \phi_j\rangle$, so

$$\|v_{|S}\|^2 = \sum_{j=1}^{n} |\langle v, \phi_j\rangle|^2. \tag{5.17}$$

Also, since $v = v_{|S} + v_{\perp S}$ and $v_{|S}$ is orthogonal to $v_{\perp S}$, it follows from the Pythagorean theorem
(5.8) that

$$\|v\|^2 = \|v_{|S}\|^2 + \|v_{\perp S}\|^2. \tag{5.18}$$

Since $\|v_{\perp S}\|^2 \ge 0$, the following corollary has been proven:

Corollary 5.3.1 (norm bound).

$$0 \le \|v_{|S}\|^2 \le \|v\|^2, \tag{5.19}$$

with equality on the right if and only if $v \in S$ and equality on the left if and only if $v$ is orthogonal
to all vectors in $S$.


Substituting (5.17) into (5.19), we get Bessel’s inequality, which is the key to understanding the 
convergence of orthonormal expansions. 


Corollary 5.3.2 (Bessel's inequality). Let $S \subseteq V$ be the subspace spanned by the set of
orthonormal vectors $\{\phi_1, \ldots, \phi_n\}$. For any $v \in V$,

$$0 \le \sum_{j=1}^{n} |\langle v, \phi_j\rangle|^2 \le \|v\|^2,$$

with equality on the right if and only if $v \in S$ and equality on the left if and only if $v$ is orthogonal
to all vectors in $S$.


Another useful characterization of the projection $v_{|S}$ is that it is the vector in $S$ that is closest
to $v$. In other words, using some $s \in S$ as an approximation to $v$, the squared error is $\|v - s\|^2$.
The following corollary says that $v_{|S}$ is the choice for $s$ that yields the least squared error (LS).


Corollary 5.3.3 (LS error property). The projection $v_{|S}$ is the unique closest vector in $S$
to $v$; i.e., for all $s \in S$,

$$\|v - v_{|S}\|^2 \le \|v - s\|^2,$$

with equality if and only if $s = v_{|S}$.


Proof: Decomposing $v$ into $v_{|S} + v_{\perp S}$, we have $v - s = [v_{|S} - s] + v_{\perp S}$. Since $v_{|S}$ and $s$ are
in $S$, $v_{|S} - s$ is also in $S$ and is therefore orthogonal to $v_{\perp S}$, so by Pythagoras,

$$\|v - s\|^2 = \|v_{|S} - s\|^2 + \|v_{\perp S}\|^2 \ge \|v_{\perp S}\|^2,$$

with equality if and only if $\|v_{|S} - s\|^2 = 0$, i.e., if and only if $s = v_{|S}$. Since $v_{\perp S} = v - v_{|S}$,
this completes the proof.
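
The three corollaries can be checked numerically on a random example. The sketch below assumes NumPy and, purely for convenience, obtains an orthonormal basis from `numpy.linalg.qr` rather than from any construction in the text; the dimensions, the seed, and all variable names are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthonormal basis of a 3-dimensional subspace S of C^6 (columns of Q).
A = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
Q, _ = np.linalg.qr(A)
basis = [Q[:, j] for j in range(3)]

v = rng.standard_normal(6) + 1j * rng.standard_normal(6)
coeffs = [np.vdot(phi, v) for phi in basis]           # <v, phi_j>
v_par = sum(c * phi for c, phi in zip(coeffs, basis))

energy_v = np.linalg.norm(v) ** 2
energy_par = sum(abs(c) ** 2 for c in coeffs)          # right side of (5.17)

# Squared error of the projection compared with a few other vectors s in S.
err_proj = np.linalg.norm(v - v_par) ** 2
err_others = []
for _ in range(5):
    alphas = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    s = sum(a * phi for a, phi in zip(alphas, basis))
    err_others.append(np.linalg.norm(v - s) ** 2)
```

Here `energy_par` should not exceed `energy_v` (Bessel's inequality), `energy_v - energy_par` should equal `err_proj` as in (5.18), and no entry of `err_others` should fall below `err_proj` (the LS error property).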


5.3.3 Gram-Schmidt orthonormalization


Theorem 5.3.1, the projection theorem, assumed an orthonormal basis $\{\phi_1, \ldots, \phi_n\}$ for any
given $n$-dimensional subspace $S$ of $V$. The use of orthonormal bases simplifies almost everything
concerning inner product spaces, and for infinite-dimensional expansions, orthonormal bases are
even more useful.


This section presents the Gram-Schmidt procedure, which, starting from an arbitrary basis
$\{s_1, \ldots, s_n\}$ for an $n$-dimensional inner product subspace $S$, generates an orthonormal basis for
$S$. The procedure is useful in finding orthonormal bases, but is even more useful theoretically,
since it shows that such bases always exist. In particular, since every $n$-dimensional subspace
contains an orthonormal basis, the projection theorem holds for each such subspace.


The procedure is almost obvious in view of the previous subsections. First an orthonormal basis,
$\phi_1 = s_1/\|s_1\|$, is found for the one-dimensional subspace $S_1$ spanned by $s_1$. Projecting $s_2$ onto
this one-dimensional subspace, a second orthonormal vector can be found. Iterating, a complete
orthonormal basis can be constructed.


In more detail, let $(s_2)_{|S_1}$ be the projection of $s_2$ onto $S_1$. Since $s_2$ and $s_1$ are linearly independent,
$(s_2)_{\perp S_1} = s_2 - (s_2)_{|S_1}$ is nonzero. It is orthogonal to $\phi_1$ since $\phi_1 \in S_1$. It is normalized
as $\phi_2 = (s_2)_{\perp S_1} / \|(s_2)_{\perp S_1}\|$. Then $\phi_1$ and $\phi_2$ span the space $S_2$ spanned by $s_1$ and $s_2$.

Now, using induction, suppose that an orthonormal basis $\{\phi_1, \ldots, \phi_k\}$ has been constructed for
the subspace $S_k$ spanned by $\{s_1, \ldots, s_k\}$. The result of projecting $s_{k+1}$ onto $S_k$ is
$(s_{k+1})_{|S_k} = \sum_{j=1}^{k} \langle s_{k+1}, \phi_j\rangle \phi_j$. The perpendicular, $(s_{k+1})_{\perp S_k} = s_{k+1} - (s_{k+1})_{|S_k}$, is given by

$$(s_{k+1})_{\perp S_k} = s_{k+1} - \sum_{j=1}^{k} \langle s_{k+1}, \phi_j\rangle\, \phi_j. \tag{5.20}$$

This is nonzero since $s_{k+1}$ is not in $S_k$ and thus not a linear combination of $\phi_1, \ldots, \phi_k$. Normalizing,

$$\phi_{k+1} = \frac{(s_{k+1})_{\perp S_k}}{\|(s_{k+1})_{\perp S_k}\|}. \tag{5.21}$$

From (5.20) and (5.21), $s_{k+1}$ is a linear combination of $\phi_1, \ldots, \phi_{k+1}$ and $s_1, \ldots, s_k$ are linear
combinations of $\phi_1, \ldots, \phi_k$, so $\phi_1, \ldots, \phi_{k+1}$ is an orthonormal basis for the space $S_{k+1}$ spanned
by $s_1, \ldots, s_{k+1}$.

In summary, given any $n$-dimensional subspace $S$ with a basis $\{s_1, \ldots, s_n\}$, the Gram-Schmidt
orthonormalization procedure produces an orthonormal basis $\{\phi_1, \ldots, \phi_n\}$ for $S$.


Note that if a set of vectors is not necessarily independent, then the procedure will automatically
find any vector $s_j$ that is a linear combination of previous vectors via the projection theorem. It
can then simply discard such a vector and proceed. Consequently it will still find an orthonormal
basis, possibly of reduced size, for the space spanned by the original vector set.
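
The procedure, including the discarding of dependent vectors, can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `gram_schmidt` and the tolerance `tol` used to decide that a perpendicular is zero are assumptions of this sketch, not part of the text.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-10):
    """Return an orthonormal basis for the span of `vectors`.

    Uses <v, u> = sum_k v_k * conj(u_k). A vector whose perpendicular has norm
    below `tol` is treated as a linear combination of earlier ones and discarded.
    """
    basis = []
    for s in vectors:
        s = np.asarray(s, dtype=complex)
        # Projection of s onto the span of the basis found so far.
        proj = sum((np.vdot(phi, s) * phi for phi in basis), np.zeros_like(s))
        perp = s - proj                      # perpendicular, as in (5.20)
        norm = np.linalg.norm(perp)
        if norm > tol:
            basis.append(perp / norm)        # normalization, as in (5.21)
    return basis

s1 = [1.0, 0.0, 1.0]
s2 = [0.0, 1.0, 1.0]
s3 = [1.0, 1.0, 2.0]                         # s1 + s2, so linearly dependent
basis = gram_schmidt([s1, s2, s3])
gram = np.array([[np.vdot(a, b) for b in basis] for a in basis])
```

With these inputs the third vector is discarded, leaving two basis vectors whose matrix of inner products `gram` is the identity up to rounding. In floating-point arithmetic this classical form can lose orthogonality for nearly dependent inputs, so the sketch is meant to mirror the derivation rather than to serve as a robust numerical routine.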


5.3.4 Orthonormal expansions in $\mathcal{L}_2$


The background has now been developed to understand countable orthonormal expansions in
$\mathcal{L}_2$. We have already looked at a number of orthogonal expansions, such as those used in the
sampling theorem, the Fourier series, and the T-spaced truncated or sinc-weighted sinusoids.
Turning these into orthonormal expansions involves only minor scaling changes.


The Fourier series will be used both to illustrate these changes and as an example of a general
orthonormal expansion. The vector space view will then allow us to understand the Fourier
series at a deeper level. Define $\theta_k(t) = e^{2\pi i k t/T}\,\mathrm{rect}(t/T)$ for $k \in \mathbb{Z}$. The set $\{\theta_k(t);\ k \in \mathbb{Z}\}$ of
functions is orthogonal with $\|\theta_k\|^2 = T$. The corresponding orthonormal expansion is obtained
by scaling each $\theta_k$ by $\sqrt{1/T}$; i.e.,

$$\phi_k(t) = \sqrt{\frac{1}{T}}\; e^{2\pi i k t/T}\,\mathrm{rect}\!\left(\frac{t}{T}\right). \tag{5.22}$$

The Fourier series of an $\mathcal{L}_2$ function $\{v(t) : [-T/2, T/2] \to \mathbb{C}\}$ then becomes $\sum_k \alpha_k \phi_k(t)$, where
$\alpha_k = \int v(t)\,\phi_k^*(t)\,dt = \langle v, \phi_k\rangle$. For any integer $n > 0$, let $S_n$ be the $(2n+1)$-dimensional subspace
spanned by the vectors $\{\phi_k,\ -n \le k \le n\}$. From the projection theorem, the projection $v_{|S_n}$ of
$v$ on $S_n$ is

$$v_{|S_n} = \sum_{k=-n}^{n} \langle v, \phi_k\rangle\, \phi_k.$$

That is, the projection $v_{|S_n}$ is simply the approximation to $v$ resulting from truncating the
expansion to $-n \le k \le n$. The error in the approximation, $v_{\perp S_n} = v - v_{|S_n}$, is orthogonal to all
vectors in $S_n$, and from the LS error property, $v_{|S_n}$ is the closest point in $S_n$ to $v$. As $n$ increases,
the subspace $S_n$ becomes larger and $v_{|S_n}$ gets closer to $v$ (i.e., $\|v - v_{|S_n}\|$ is nonincreasing).
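
The behavior of the truncated Fourier series as a sequence of projections can be observed numerically. The sketch below assumes NumPy, approximates the integrals by midpoint Riemann sums on a uniform grid, and uses the test waveform $v(t) = t$ on $[-T/2, T/2]$ with $T = 1$; the grid size and all names are illustrative assumptions.

```python
import numpy as np

T = 1.0
N = 4096                                   # grid points for the Riemann sums
dt = T / N
t = -T / 2 + (np.arange(N) + 0.5) * dt     # midpoints of the grid cells
v = t.copy()                               # test waveform v(t) = t

def inner(a, b):
    """Approximate <a, b> = integral of a(t) conj(b(t)) dt over [-T/2, T/2]."""
    return np.sum(a * np.conj(b)) * dt

def phi(k):
    """Orthonormal function phi_k of (5.22), sampled on the grid."""
    return np.sqrt(1.0 / T) * np.exp(2j * np.pi * k * t / T)

def projection(n):
    """Projection of v on S_n, the span of phi_k for -n <= k <= n."""
    return sum(inner(v, phi(k)) * phi(k) for k in range(-n, n + 1))

energy_v = inner(v, v).real
errors = []
for n in (1, 2, 4, 8, 16):
    e = v - projection(n)
    errors.append(inner(e, e).real)        # squared error ||v - v_|S_n||^2
```

The list `errors` should decrease as $n$ grows and should never exceed `energy_v`, in line with the LS error property and Bessel's inequality.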


As the analysis above applies equally well to any orthonormal sequence of functions, the general 
case can now be considered. The main result of interest is the following infinite-dimensional 
generalization of the projection theorem. 


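Theorem 5.3.2 (Infinite-dimensional projection). Let $\{\phi_m,\ 1 \le m < \infty\}$ be a sequence of
orthonormal vectors in $\mathcal{L}_2$, and let $v$ be an arbitrary $\mathcal{L}_2$ vector. Then there exists a unique⁷ $\mathcal{L}_2$
vector $u$ such that $v - u$ is orthogonal to each $\phi_m$ and

$$\lim_{n\to\infty} \Big\| u - \sum_{m=1}^{n} \alpha_m \phi_m \Big\| = 0, \quad \text{where } \alpha_m = \langle v, \phi_m\rangle, \tag{5.23}$$

$$\|u\|^2 = \sum_{m=1}^{\infty} |\alpha_m|^2. \tag{5.24}$$

Conversely, for any complex sequence $\{\alpha_m;\ 1 \le m < \infty\}$ such that $\sum_k |\alpha_k|^2 < \infty$, an $\mathcal{L}_2$ function
$u$ exists satisfying (5.23) and (5.24).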


Remark: This theorem says that the orthonormal expansion $\sum_m \alpha_m \phi_m$ converges in the $\mathcal{L}_2$
sense to an $\mathcal{L}_2$ function $u$, which we later interpret as the projection of $v$ onto the
infinite-dimensional subspace $S$ spanned by $\{\phi_m,\ 1 \le m < \infty\}$. For example, in the Fourier series case,
the orthonormal functions span the subspace of $\mathcal{L}_2$ functions time-limited to $[-T/2, T/2]$, and $u$
is then $v(t)\,\mathrm{rect}(t/T)$. The difference $v(t) - v(t)\,\mathrm{rect}(t/T)$ is then $\mathcal{L}_2$ equivalent to 0 over $[-T/2, T/2]$
and thus orthogonal to each $\phi_m$.

⁷ Recall that the vectors in the $\mathcal{L}_2$ class of functions are equivalence classes, so this uniqueness specifies only
the equivalence class and not an individual function within that class.

Proof: Let $S_n$ be the subspace spanned by $\{\phi_1, \ldots, \phi_n\}$. From the finite-dimensional projection
theorem, the projection of $v$ on $S_n$ is then $v_{|S_n} = \sum_{k=1}^{n} \alpha_k \phi_k$. From (5.17),

$$\|v_{|S_n}\|^2 = \sum_{k=1}^{n} |\alpha_k|^2, \quad \text{where } \alpha_k = \langle v, \phi_k\rangle. \tag{5.25}$$

This quantity is nondecreasing with $n$, and from Bessel's inequality, it is upper bounded by $\|v\|^2$,
which is finite since $v$ is $\mathcal{L}_2$. It follows that for any $n$ and any $m > n$,

$$\|v_{|S_m} - v_{|S_n}\|^2 = \sum_{k=n+1}^{m} |\alpha_k|^2 \le \sum_{k > n} |\alpha_k|^2 \xrightarrow{\ n \to \infty\ } 0. \tag{5.26}$$

This says that the projections $\{v_{|S_n};\ n \in \mathbb{Z}^+\}$ approach each other as $n \to \infty$ in terms of their
energy difference.

A sequence whose terms approach each other is called a Cauchy sequence. The Riesz-Fischer
theorem⁸ is a central theorem of analysis stating that any Cauchy sequence of $\mathcal{L}_2$ waveforms has
an $\mathcal{L}_2$ limit. Taking $u$ to be this $\mathcal{L}_2$ limit, i.e., $u = \operatorname*{l.i.m.}_{n\to\infty} v_{|S_n}$, we have (5.23) and (5.24).⁹


Essentially the same use of the Riesz-Fischer theorem establishes (5.23) and (5.24) starting with
the sequence $\alpha_1, \alpha_2, \ldots$.


Let $S$ be the space of functions (or, more precisely, of equivalence classes) that can be represented
as $\operatorname{l.i.m.} \sum_k \alpha_k \phi_k(t)$ over all sequences $\alpha_1, \alpha_2, \ldots$ such that $\sum_k |\alpha_k|^2 < \infty$. It can be seen that
this is an inner product space. It is the space spanned by the orthonormal sequence $\{\phi_k;\ k \in \mathbb{Z}\}$.


The following proof of the Fourier series theorem illustrates the use of the infinite dimensional 
projection theorem and infinite dimensional spanning sets. 


Proof of Theorem 4.4.1: Let $\{v(t) : [-T/2, T/2] \to \mathbb{C}\}$ be an arbitrary $\mathcal{L}_2$ function over
$[-T/2, T/2]$. We have already seen that $v(t)$ is $\mathcal{L}_1$, that $\hat v_k = \frac{1}{T} \int v(t)\, e^{-2\pi i k t/T}\, dt$ exists and
that $|\hat v_k| \le \frac{1}{T} \int |v(t)|\, dt$ for all $k \in \mathbb{Z}$. From Theorem 5.3.2, there is an $\mathcal{L}_2$ function $u(t) =
\operatorname{l.i.m.} \sum_k \hat v_k\, e^{2\pi i k t/T}\, \mathrm{rect}(t/T)$ such that $v(t) - u(t)$ is orthogonal to $\theta_k(t) = e^{2\pi i k t/T}\,\mathrm{rect}(t/T)$ for
each $k \in \mathbb{Z}$.

We now need an additional basic fact:¹⁰ the above set of orthogonal functions $\{\theta_k(t) =
e^{2\pi i k t/T}\,\mathrm{rect}(t/T);\ k \in \mathbb{Z}\}$ spans the space of $\mathcal{L}_2$ functions over $[-T/2, T/2]$, i.e., there is no
function of positive energy over $[-T/2, T/2]$ that is orthogonal to each $\theta_k(t)$. Using this fact,
$v(t) - u(t)$ has zero energy and is equal to 0 a.e. Thus $v(t) = \operatorname{l.i.m.} \sum_k \hat v_k\, e^{2\pi i k t/T}\,\mathrm{rect}(t/T)$. The
energy equation then follows from (5.24). The final part of the theorem follows from the final
part of Theorem 5.3.2.


As seen by the above proof, the infinite dimensional projection theorem can provide simple and 
intuitive proofs and interpretations of limiting arguments and the approximations suggested by 
those limits. The appendix uses this theorem to prove both parts of the Plancherel theorem, 
the sampling theorem, and the aliasing theorem. 


Another, more pragmatic, use of the theorem lies in providing a uniform way to treat all
orthonormal expansions. As in the above Fourier series proof, though, the theorem doesn't necessarily
provide a simple characterization of the space spanned by the orthonormal set. Fortunately,
however, knowing that the truncated sinusoids span the space of $\mathcal{L}_2$ functions over $[-T/2, T/2]$ shows us, by duality,
that the T-spaced sinc functions span the space of baseband-limited $\mathcal{L}_2$ functions. Similarly,
both the T-spaced truncated and the sinc-weighted sinusoids span all of $\mathcal{L}_2$.

⁸ See any text on real and complex analysis, such as Rudin [26].

⁹ An inner product space in which all Cauchy sequences have limits is said to be complete, and is called a
Hilbert space. Thus the Riesz-Fischer theorem states that $\mathcal{L}_2$ is a Hilbert space.

¹⁰ Again, see any basic text on real and complex analysis.


5.4 Summary 


The theory of $\mathcal{L}_2$ waveforms, viewed as vectors in the inner product space known as signal space,
has been developed. The most important consequence of this viewpoint is that all orthonormal
expansions in $\mathcal{L}_2$ may be viewed in a common framework. The Fourier series is simply one
example.


Another important consequence is that, as additional terms are added to a partial orthonormal
expansion of an $\mathcal{L}_2$ waveform, the partial expansion changes by increasingly small amounts,
approaching a limit in $\mathcal{L}_2$. A major reason for restricting attention to finite-energy waveforms
(in addition to physical reality) is that as their energy gets used up in different degrees of freedom
(i.e., expansion coefficients), there is less energy available for other degrees of freedom, so that
some sort of convergence must result. The $\mathcal{L}_2$ limit above simply makes this intuition precise.


Another consequence is the realization that if $\mathcal{L}_2$ functions are represented by orthonormal
expansions, or approximated by partial orthonormal expansions, then there is no further need
to deal with sophisticated mathematical issues such as $\mathcal{L}_2$ equivalence. Of course, how the
truncated expansions converge may be tricky mathematically, but the truncated expansions
themselves are very simple and friendly.


5A Appendix: Supplementary material and proofs 


The first part of the appendix uses the inner-product results of this chapter to prove the theorems 
about Fourier transforms in Chapter 4. The second part uses inner-products to prove the 
theorems in Chapter 4 about sampling and aliasing. The final part discusses prolate spheroidal 
waveforms; these provide additional insight about the degrees of freedom in a time/bandwidth 
region. 


5A.1 The Plancherel theorem 


Proof of Theorem 4.5.1 (Plancherel 1): The idea of the proof is to expand the time
waveform $u$ into an orthonormal expansion for which the partial sums have known Fourier
transforms; the $\mathcal{L}_2$ limit of these transforms is then identified as the $\mathcal{L}_2$ transform $\hat u$ of $u$.


First expand an arbitrary $\mathcal{L}_2$ function $u(t)$ in the T-spaced truncated sinusoid expansion, using
$T = 1$. This expansion spans $\mathcal{L}_2$ and the orthogonal functions $e^{2\pi i k t}\,\mathrm{rect}(t - m)$ are orthonormal
since $T = 1$. Thus the infinite dimensional projection, as specified by Theorem 5.3.2, is¹¹

$$u(t) = \operatorname*{l.i.m.}_{n\to\infty} u^{(n)}(t), \quad \text{where } u^{(n)}(t) = \sum_{m=-n}^{n} \sum_{k=-n}^{n} \hat u_{k,m}\, \theta_{k,m}(t),$$

$$\theta_{k,m}(t) = e^{2\pi i k t}\,\mathrm{rect}(t - m) \quad \text{and} \quad \hat u_{k,m} = \int u(t)\, \theta_{k,m}^*(t)\, dt.$$


Since $u^{(n)}(t)$ is time-limited, it is $\mathcal{L}_1$, and thus has a continuous Fourier transform which is
defined pointwise by

$$\hat u^{(n)}(f) = \sum_{m=-n}^{n} \sum_{k=-n}^{n} \hat u_{k,m}\, \psi_{k,m}(f), \tag{5.27}$$

where $\psi_{k,m}(f) = e^{-2\pi i f m}\,\mathrm{sinc}(f - k)$, the Fourier transform of $\theta_{k,m}(t)$, is the $k, m$ term of the
T-spaced sinc-weighted orthonormal set with $T = 1$. By the final part of Theorem 5.3.2, the sequence of
vectors $\hat u^{(n)}$ converges to an $\mathcal{L}_2$ vector $\hat u$ (equivalence class of functions) denoted as the Fourier
transform of $u(t)$ and satisfying

$$\lim_{n\to\infty} \|\hat u - \hat u^{(n)}\| = 0. \tag{5.28}$$


This must now be related to the functions $u_A(t)$ and $\hat u_A(f)$ in the theorem. First, for each
integer $\ell > n$ define

$$\hat u^{(n,\ell)}(f) = \sum_{m=-n}^{n} \sum_{k=-\ell}^{\ell} \hat u_{k,m}\, \psi_{k,m}(f). \tag{5.29}$$

Since this is a more complete partial expansion than $\hat u^{(n)}(f)$,

$$\|\hat u - \hat u^{(n)}\| \ge \|\hat u - \hat u^{(n,\ell)}\|.$$

In the limit $\ell \to \infty$, $\hat u^{(n,\ell)}$ is the Fourier transform $\hat u_A(f)$ of $u_A(t)$ for $A = n + \frac{1}{2}$. Combining
this with (5.28),

$$\lim_{n\to\infty} \|\hat u - \hat u_{n+\frac{1}{2}}\| = 0. \tag{5.30}$$

Finally, taking the limit of the finite dimensional energy equation,

$$\|u^{(n)}\|^2 = \sum_{k=-n}^{n} \sum_{m=-n}^{n} |\hat u_{k,m}|^2 = \|\hat u^{(n)}\|^2,$$

we get the $\mathcal{L}_2$ energy equation, $\|u\|^2 = \|\hat u\|^2$. This also shows that $\|\hat u - \hat u_A\|$ is monotonic in $A$
so that (5.30) can be replaced by

$$\lim_{A\to\infty} \|\hat u - \hat u_A\| = 0.$$

¹¹ Note that $\{\theta_{k,m};\ k, m \in \mathbb{Z}\}$ is a countable set of orthonormal vectors, and they have been arranged in an order
so that, for all $n \in \mathbb{Z}^+$, all terms with $|k| \le n$ and $|m| \le n$ come before all other terms.

Proof of Theorem 4.5.2 (Plancherel 2): By time/frequency duality with Theorem 4.5.1,
we see that $\operatorname*{l.i.m.}_{B\to\infty} u_B(t)$ exists and we denote this by $\mathcal{F}^{-1}(\hat u(f))$. The only remaining thing
to prove is that this inverse transform is $\mathcal{L}_2$ equivalent to the original $u(t)$. Note first that
the Fourier transform of $\theta_{0,0}(t) = \mathrm{rect}(t)$ is $\mathrm{sinc}(f)$ and that the inverse transform, defined as
above, is $\mathcal{L}_2$ equivalent to $\mathrm{rect}(t)$. By time and frequency shifts, we see that $u^{(n)}(t)$ is the inverse
transform, defined as above, of $\hat u^{(n)}(f)$. It follows that $\lim_{n\to\infty} \|\mathcal{F}^{-1}(\hat u) - u^{(n)}\| = 0$, so we see
that $\|\mathcal{F}^{-1}(\hat u) - u\| = 0$.

As an example of the Plancherel theorem, let $h(t)$ be 1 on the rationals in $(0, 1)$ and be zero
elsewhere. Then $h$ is both $\mathcal{L}_1$ and $\mathcal{L}_2$ and has a Fourier transform $\hat h(f) = 0$ which is continuous,
$\mathcal{L}_1$, and $\mathcal{L}_2$. The inverse transform is also 0 and equal to $h(t)$ a.e.


The function $h(t)$ above is in some sense trivial since it is $\mathcal{L}_2$ equivalent to the zero function. We
now discuss an example of a real $\mathcal{L}_2$ function that is nonzero only on the interval $(0, 1)$. This
function is $\mathcal{L}_1$ and has a continuous Fourier transform, but all functions in its equivalence class are
discontinuous everywhere and unbounded over every open interval within $(0, 1)$. This example
will illustrate how truly bizarre functions can have nice Fourier transforms and vice versa. It
will also be used later to illustrate some properties of $\mathcal{L}_2$ functions.


Example 5A.1 (A Bizarre $\mathcal{L}_2$ and $\mathcal{L}_1$ function). List the rationals in $(0, 1)$ by increasing
denominator, i.e., as $a_1 = 1/2$, $a_2 = 1/3$, $a_3 = 2/3$, $a_4 = 1/4$, $a_5 = 3/4$, $a_6 = 1/5, \ldots$. Define

$$g_n(t) = \begin{cases} 1 & \text{for } a_n \le t < a_n + 2^{-n-1} \\ 0 & \text{elsewhere,} \end{cases} \qquad g(t) = \sum_{n=1}^{\infty} g_n(t).$$

Thus $g(t)$ is a sum of rectangular functions, one for each rational number, with the width of
the function going to zero rapidly with the index of the rational number (see Figure 5.3). The
integral of $g(t)$ can be calculated as

$$\int_0^1 g(t)\, dt = \sum_{n=1}^{\infty} \int g_n(t)\, dt = \sum_{n=1}^{\infty} 2^{-n-1} = \frac{1}{2}.$$

Thus $g(t)$ is an $\mathcal{L}_1$ function as illustrated in Figure 5.3.
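
The ordering of the rationals and the first few rectangles can be reproduced exactly with rational arithmetic. The sketch below uses only the Python standard library; the function names `rationals_by_denominator` and `g_partial`, and the choice of seven terms, are illustrative assumptions.

```python
from fractions import Fraction
from math import gcd

def rationals_by_denominator(count):
    """First `count` rationals in (0, 1), listed by increasing denominator."""
    out = []
    q = 2
    while len(out) < count:
        for p in range(1, q):
            if gcd(p, q) == 1 and len(out) < count:
                out.append(Fraction(p, q))
        q += 1
    return out

def g_partial(t, a):
    """Partial sum g_1(t) + ... + g_N(t), where N = len(a)."""
    return sum(
        1
        for n, a_n in enumerate(a, start=1)
        if a_n <= t < a_n + Fraction(1, 2 ** (n + 1))
    )

a = rationals_by_denominator(7)

# Exact integral of the partial sum: the widths 2^(-n-1) add up to 1/2 - 2^(-N-1).
partial_integral = sum(Fraction(1, 2 ** (n + 1)) for n in range(1, len(a) + 1))

# At t = 2/3 both g_1 (on [1/2, 3/4)) and g_3 (starting at 2/3) equal 1.
value_at_two_thirds = g_partial(Fraction(2, 3), a)
```

Increasing the number of terms moves `partial_integral` toward 1/2, while evaluating `g_partial` near a rational with more terms shows the pile-up of overlapping rectangles discussed in the example.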


Figure 5.3: First 7 terms of $\sum_i g_i(t)$.


Consider the interval $[\frac{2}{3}, \frac{2}{3} + \frac{1}{16})$ corresponding to the rectangle $g_3$ in the figure. Since the rationals
are dense over the real line, there is a rational, say $a_j$, in the interior of this interval, and thus
a new interval starting at $a_j$ over which $g_1$, $g_3$, and $g_j$ all have value 1; thus $g(t) \ge 3$ within this
new interval. Moreover, this same argument can be repeated within this new interval, which
again contains a rational, say $a_{j'}$. Thus there is an interval starting at $a_{j'}$ where $g_1$, $g_3$, $g_j$, and
$g_{j'}$ are 1 and thus $g(t) \ge 4$.

Iterating this argument, we see that $[\frac{2}{3}, \frac{2}{3} + \frac{1}{16})$ contains subintervals within which $g(t)$ takes on
arbitrarily large values. In fact, by taking the limit of $a_1, a_3, a_j, a_{j'}, \ldots$, we find a limit point $a$ for
which $g(a) = \infty$. Moreover, we can apply the same argument to any open interval within $(0, 1)$
to show that $g(t)$ takes on infinite values within that interval.¹² More explicitly, for every $\varepsilon > 0$
and every $t \in (0, 1)$, there is a $t'$ such that $|t - t'| < \varepsilon$ and $g(t') = \infty$. This means that $g(t)$ is
discontinuous and unbounded in each region of $(0, 1)$.


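The function $g(t)$ is also in $\mathcal{L}_2$ as seen below:

\begin{align}
\int_0^1 g^2(t)\, dt &= \sum_{n,m} \int g_n(t)\, g_m(t)\, dt \tag{5.31} \\
&= \sum_{n} \int g_n^2(t)\, dt + 2 \sum_{n} \sum_{m=n+1}^{\infty} \int g_n(t)\, g_m(t)\, dt \tag{5.32} \\
&\le \frac{1}{2} + 2 \sum_{n} \sum_{m=n+1}^{\infty} \int g_m(t)\, dt = \frac{3}{2}, \tag{5.33}
\end{align}

where in (5.33) we have used the fact that $g_n^2(t) = g_n(t)$ in the first term and $g_n(t) \le 1$ in the
second term.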


In conclusion, $g(t)$ is both $\mathcal{L}_1$ and $\mathcal{L}_2$ but is discontinuous everywhere and takes on infinite values
at points in every interval. The transform $\hat g(f)$ is continuous and $\mathcal{L}_2$ but not $\mathcal{L}_1$. The inverse
transform, $g_B(t)$, of $\hat g(f)\,\mathrm{rect}(\frac{f}{2B})$ is continuous, and converges in $\mathcal{L}_2$ to $g(t)$ as $B \to \infty$. For
$B = 2^k$, the function $g_B(t)$ is roughly approximated by $g_1(t) + \cdots + g_k(t)$, all somewhat rounded
at the edges.


This is a nice example of a continuous function $\hat g(f)$ which has a bizarre inverse Fourier transform.
Note that $g(t)$ and the function $h(t)$ that is 1 on the rationals in $(0, 1)$ and 0 elsewhere are both
discontinuous everywhere in $(0, 1)$. However, the function $h(t)$ is 0 a.e., and thus is weird only in
an artificial sense. For most purposes, it is the same as the zero function. The function $g(t)$ is
weird in a more fundamental sense. It cannot be made respectable by changing it on a countable
set of points.


One should not conclude from this example that intuition cannot be trusted, or that it is
necessary to take a few graduate math courses before feeling comfortable with functions. One can
conclude, however, that the simplicity of the results about Fourier transforms and orthonormal
expansions for $\mathcal{L}_2$ functions is truly extraordinary in view of the bizarre functions included in
the $\mathcal{L}_2$ class.


In summary, Plancherel's theorem has taught us two things. First, Fourier transforms and
inverse transforms exist for all $\mathcal{L}_2$ functions. Second, finite-interval and finite-bandwidth
approximations become arbitrarily good (in the sense of $\mathcal{L}_2$ convergence) as the interval or the
bandwidth becomes large.

¹² The careful reader will observe that $g(t)$ is not really a function $\mathbb{R} \to \mathbb{R}$, but rather a function from $\mathbb{R}$ to the
extended set of real values including $\infty$ and $-\infty$. The set of $t$ on which $g(t) = \infty$ has zero measure and this can
be ignored in Lebesgue integration. Do not confuse a function that takes on an infinite value at some isolated
point with a unit impulse at that point. The first integrates to 0 around the singularity, whereas the second is a
generalized function that by definition integrates to 1.


5A.2 The sampling and aliasing theorems


This section contains proofs of the sampling and aliasing theorems. The proofs are important 
and not available elsewhere in this form. However, they involve some careful mathematical 
analysis that might be beyond the interest and/or background of many students. 


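Proof of Theorem 4.6.2 (Sampling Theorem): Let $\hat u(f)$ be an $\mathcal{L}_2$ function that is zero
outside of $[-W, W]$. From Theorem 4.3.2, $\hat u(f)$ is $\mathcal{L}_1$, so by Lemma 4.5.1,

$$u(t) = \int_{-W}^{W} \hat u(f)\, e^{2\pi i f t}\, df \tag{5.34}$$

holds at each $t \in \mathbb{R}$. We want to show that the sampling theorem expansion also holds at each
$t$. By the DTFT theorem,

$$\hat u(f) = \operatorname*{l.i.m.}_{\ell\to\infty} \hat u^{(\ell)}(f), \quad \text{where } \hat u^{(\ell)}(f) = \sum_{k=-\ell}^{\ell} u_k\, \hat\phi_k(f), \tag{5.35}$$

and where $\hat\phi_k(f) = e^{-2\pi i k f/(2W)}\,\mathrm{rect}\!\left(\frac{f}{2W}\right)$ and

$$u_k = \frac{1}{2W} \int_{-W}^{W} \hat u(f)\, e^{2\pi i k f/(2W)}\, df. \tag{5.36}$$

Comparing (5.34) and (5.36), we see as before that $2W u_k = u(\frac{k}{2W})$. The functions $\hat\phi_k(f)$ are in
$\mathcal{L}_1$, so the finite sum $\hat u^{(\ell)}(f)$ is also in $\mathcal{L}_1$. Thus the inverse Fourier transform

$$u^{(\ell)}(t) = \int \hat u^{(\ell)}(f)\, e^{2\pi i f t}\, df = \sum_{k=-\ell}^{\ell} u\!\left(\frac{k}{2W}\right) \mathrm{sinc}(2Wt - k)$$

is defined pointwise at each $t$. For each $t \in \mathbb{R}$, the difference $u(t) - u^{(\ell)}(t)$ is then

$$u(t) - u^{(\ell)}(t) = \int_{-W}^{W} \big[\hat u(f) - \hat u^{(\ell)}(f)\big]\, e^{2\pi i f t}\, df.$$

This integral can be viewed as the inner product of $\hat u(f) - \hat u^{(\ell)}(f)$ and $e^{-2\pi i f t}\,\mathrm{rect}\!\left(\frac{f}{2W}\right)$, so, by
the Schwarz inequality, we have

$$|u(t) - u^{(\ell)}(t)| \le \sqrt{2W}\, \|\hat u - \hat u^{(\ell)}\|.$$

From the $\mathcal{L}_2$ convergence of the DTFT, the right side approaches 0 as $\ell \to \infty$, so the left side
also approaches 0 for each $t$, establishing pointwise convergence.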
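
The pointwise convergence of the sampling expansion can be observed on a simple band-limited waveform. The sketch below assumes NumPy (whose `np.sinc(x)` is $\sin(\pi x)/(\pi x)$, matching the sinc used here) and takes, as an illustrative choice not found in the text, $u(t) = \mathrm{sinc}^2(Wt)$ with $W = 1$, whose transform is a triangle supported on $[-W, W]$.

```python
import numpy as np

W = 1.0                                    # baseband bandwidth (assumed value)

def u(t):
    """Test waveform sinc^2(W t); its transform is a triangle supported on [-W, W]."""
    return np.sinc(W * t) ** 2

def partial_expansion(t, ell):
    """u^(ell)(t) = sum over |k| <= ell of u(k/2W) sinc(2Wt - k)."""
    k = np.arange(-ell, ell + 1)
    return float(np.sum(u(k / (2 * W)) * np.sinc(2 * W * t - k)))

t0 = 0.3                                   # an arbitrary point between samples
errors = {ell: abs(u(t0) - partial_expansion(t0, ell)) for ell in (5, 50, 200)}
```

The entries of `errors` should shrink as `ell` grows, which is the pointwise convergence established in the proof; the rate seen here depends on how quickly the samples of this particular waveform decay.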


Proof of Theorem 4.6.3 (Sampling theorem for transmission): For a given $W$, assume
that the sequence $\{u(\frac{k}{2W});\ k \in \mathbb{Z}\}$ satisfies $\sum_k |u(\frac{k}{2W})|^2 < \infty$. Define $u_k = \frac{1}{2W} u(\frac{k}{2W})$ for each
$k \in \mathbb{Z}$. By the DTFT theorem, there is a frequency function $\hat u(f)$, nonzero only over $[-W, W]$,
that satisfies (4.60) and (4.61). By the sampling theorem, the inverse transform $u(t)$ of $\hat u(f)$ has
the desired properties.


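Proof of Theorem 4.7.1 (Aliasing theorem): We start by separating $\hat u(f)$ into frequency
slices $\{\hat v_m(f);\ m \in \mathbb{Z}\}$,

$$\hat u(f) = \sum_{m} \hat v_m(f), \quad \text{where } \hat v_m(f) = \hat u(f)\, \mathrm{rect}^\dagger(fT - m). \tag{5.37}$$

The function $\mathrm{rect}^\dagger(f)$ is defined to equal 1 for $-\frac{1}{2} < f \le \frac{1}{2}$ and 0 elsewhere. It is $\mathcal{L}_2$ equivalent
to $\mathrm{rect}(f)$, but gives us pointwise equality in (5.37). For each positive integer $n$, define $\hat v^{(n)}(f)$
as

$$\hat v^{(n)}(f) = \sum_{m=-n}^{n} \hat v_m(f) = \begin{cases} \hat u(f) & \text{for } -\frac{2n+1}{2T} < f \le \frac{2n+1}{2T} \\ 0 & \text{elsewhere.} \end{cases} \tag{5.38}$$

It is shown in Exercise 5.16 that the given conditions on $\hat u(f)$ imply that $\hat u(f)$ is in $\mathcal{L}_1$. In
conjunction with (5.38), this implies that

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} |\hat u(f) - \hat v^{(n)}(f)|\, df = 0.$$

Since $\hat u(f) - \hat v^{(n)}(f)$ is in $\mathcal{L}_1$, the inverse transform at each $t$ satisfies

$$|u(t) - v^{(n)}(t)| = \left| \int_{-\infty}^{\infty} \big[\hat u(f) - \hat v^{(n)}(f)\big]\, e^{2\pi i f t}\, df \right| \le \int_{-\infty}^{\infty} |\hat u(f) - \hat v^{(n)}(f)|\, df = \int_{|f| \ge (2n+1)/(2T)} |\hat u(f)|\, df.$$

Since $\hat u(f)$ is in $\mathcal{L}_1$, the final integral above approaches 0 with increasing $n$. Thus, for each $t$,
we have

$$u(t) = \lim_{n\to\infty} v^{(n)}(t). \tag{5.39}$$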


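Next define $\hat s_m(f)$ as the frequency slice $\hat v_m(f)$ shifted down to baseband, i.e.,

$$\hat s_m(f) = \hat v_m\!\left(f + \frac{m}{T}\right) = \hat u\!\left(f + \frac{m}{T}\right) \mathrm{rect}^\dagger(fT). \tag{5.40}$$

Applying the sampling theorem to $v_m(t)$, we get

$$v_m(t) = \sum_{k} v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right) e^{2\pi i m t/T}. \tag{5.41}$$

Applying the frequency shift relation to (5.40), we see that $s_m(t) = v_m(t)\, e^{-2\pi i m t/T}$, and thus

$$s_m(t) = \sum_{k} v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right). \tag{5.42}$$

Now define $\hat s^{(n)}(f) = \sum_{m=-n}^{n} \hat s_m(f)$. From (5.40), we see that $\hat s^{(n)}(f)$ is the aliased version of
$\hat v^{(n)}(f)$, as illustrated in Figure 4.10. The inverse transform is then

$$s^{(n)}(t) = \sum_{k=-\infty}^{\infty} \sum_{m=-n}^{n} v_m(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right). \tag{5.43}$$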


We have interchanged the order of summation, which is valid since the sum over $m$ is finite.
Finally, define $\hat s(f)$ to be the "folded" version of $\hat u(f)$ summing over all $m$, i.e.,

$$\hat s(f) = \operatorname*{l.i.m.}_{n\to\infty} \hat s^{(n)}(f). \tag{5.44}$$

Exercise 5.16 shows that this limit converges in the $\mathcal{L}_2$ sense to an $\mathcal{L}_2$ function $\hat s(f)$. Exercise
4.38 provides an example where $\hat s(f)$ is not in $\mathcal{L}_2$ if the condition $\lim_{|f|\to\infty} \hat u(f)\,|f|^{1+\varepsilon} = 0$ is not
satisfied.

Since $\hat s(f)$ is in $\mathcal{L}_2$ and is 0 outside $[-\frac{1}{2T}, \frac{1}{2T}]$, the sampling theorem shows that the inverse
transform $s(t)$ satisfies

$$s(t) = \sum_{k} s(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right). \tag{5.45}$$

Combining this with (5.43),

$$s(t) - s^{(n)}(t) = \sum_{k} \Big[ s(kT) - \sum_{m=-n}^{n} v_m(kT) \Big]\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right). \tag{5.46}$$

From (5.44), we see that $\lim_{n\to\infty} \|s - s^{(n)}\| = 0$, and thus

$$\lim_{n\to\infty} \sum_{k} |s(kT) - v^{(n)}(kT)|^2 = 0.$$

This implies that $s(kT) = \lim_{n\to\infty} v^{(n)}(kT)$ for each integer $k$. From (5.39), we also have
$u(kT) = \lim_{n\to\infty} v^{(n)}(kT)$, and thus $s(kT) = u(kT)$ for each $k \in \mathbb{Z}$. Substituting into (5.45),

$$s(t) = \sum_{k} u(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right). \tag{5.47}$$

This shows that (5.44) implies (5.47). Since $s(t)$ is in $\mathcal{L}_2$, it follows that $\sum_k |u(kT)|^2 < \infty$.
Conversely, (5.47) defines a unique $\mathcal{L}_2$ function, and thus its Fourier transform must be $\mathcal{L}_2$
equivalent to $\hat s(f)$ as defined in (5.44).
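
A closed-form instance makes the conclusion $s(kT) = u(kT)$ concrete. The sketch below assumes NumPy and uses an example chosen here, not taken from the text: with $T = 1$ and $u(t) = \mathrm{sinc}^2(t)$, the transform $\hat u(f)$ is the triangle $1 - |f|$ on $[-1, 1]$, and folding its slices into $[-\frac{1}{2}, \frac{1}{2}]$ gives $\hat s(f) = \mathrm{rect}(f)$, so that $s(t) = \mathrm{sinc}(t)$.

```python
import numpy as np

T = 1.0                          # sampling interval (fixed by this example)
k = np.arange(-200, 201)         # truncation of the sum over k

def u(t):
    """u(t) = sinc^2(t); its transform, 1 - |f| on [-1, 1], extends beyond [-1/2T, 1/2T]."""
    return np.sinc(t) ** 2

def s(t):
    """s(t) = sinc(t); its transform rect(f) is the folded version of the triangle."""
    return np.sinc(t)

def s_from_samples_of_u(t):
    """Right side of (5.47), sum_k u(kT) sinc(t/T - k), truncated to |k| <= 200."""
    return float(np.sum(u(k * T) * np.sinc(t / T - k)))

# The samples of u and s agree even though the waveforms differ between samples.
sample_gap = float(np.max(np.abs(u(k * T) - s(k * T))))
t_test = [0.1, 0.3, 0.5, 1.7]
interp_gap = max(abs(s_from_samples_of_u(t) - s(t)) for t in t_test)
waveform_gap = abs(u(0.5) - s(0.5))
```

Both `sample_gap` and `interp_gap` should be zero up to rounding, while `waveform_gap` is clearly nonzero: interpolating the samples of the aliased waveform $u$ recovers $s$, not $u$.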


5A.3  Prolate spheroidal waveforms 


The prolate spheroidal waveforms (see [29]) are a set of orthonormal functions that provide a
more precise way to view the degree-of-freedom arguments of Section 4.7.2. For each choice of
baseband bandwidth $W$ and time interval $[-T/2, T/2]$, these functions form an orthonormal set
$\{\phi_0(t), \phi_1(t), \ldots\}$ of real $\mathcal{L}_2$ functions time-limited to $[-T/2, T/2]$. In a sense to be described,
these functions have the maximum possible energy in the frequency band $(-W, W)$ subject to
their constraint to $[-T/2, T/2]$.


To be more precise, for each $n \ge 0$ let $\hat\phi_n(f)$ be the Fourier transform of $\phi_n(t)$, and define

$$\hat\theta_n(f) = \begin{cases} \hat\phi_n(f) & \text{for } -W < f < W \\ 0 & \text{elsewhere.} \end{cases} \tag{5.48}$$

That is, $\theta_n(t)$ is $\phi_n(t)$ truncated in frequency to $(-W, W)$; equivalently, $\theta_n(t)$ may be viewed as
the result of passing $\phi_n(t)$ through an ideal low-pass filter.

The function $\phi_0(t)$ is chosen to be the normalized function $\phi_0(t) : (-T/2, T/2) \to \mathbb{R}$ that
maximizes the energy in $\theta_0(t)$. We will not show how to solve this optimization problem.
However, $\phi_0(t)$ turns out to resemble $\sqrt{1/T}\,\mathrm{rect}(\frac{t}{T})$, except that it is rounded at the edges to
reduce the out-of-band energy.

Similarly, for each $n > 0$, the function $\phi_n(t)$ is chosen to be the normalized function $\{\phi_n(t) :
(-T/2, T/2) \to \mathbb{R}\}$ that is orthonormal to $\phi_m(t)$ for each $m < n$ and, subject to this constraint,
maximizes the energy in $\theta_n(t)$.


Finally, define $\lambda_n = \|\theta_n\|^2$. It can be shown that $1 > \lambda_0 > \lambda_1 > \cdots$. We interpret $\lambda_n$ as the
fraction of energy in $\phi_n$ that is baseband-limited to $(-W, W)$. The number of degrees of freedom
in $(-T/2, T/2)$, $(-W, W)$ is then reasonably defined as the largest $n$ for which $\lambda_n$ is close to 1.


The values $\lambda_n$ depend on the product $TW$, so they can be denoted by $\lambda_n(TW)$. The main result
about prolate spheroidal wave functions, which we do not prove, is that for any $\varepsilon > 0$,

$$\lim_{TW\to\infty} \lambda_n(TW) = \begin{cases} 1 & \text{for } n < 2TW(1-\varepsilon) \\ 0 & \text{for } n > 2TW(1+\varepsilon). \end{cases}$$

This says that when $TW$ is large, there are close to $2TW$ orthonormal functions for which most of
the energy in the time-limited function is also frequency-limited, but there are not significantly
more orthonormal functions with this property.
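
The threshold near $2TW$ can be explored numerically. The following is a minimal sketch, not a derivation: it assumes that the operation "time-limit to $[-T/2, T/2]$, then ideal low-pass filter to $(-W, W)$" can be approximated by a matrix obtained from a midpoint-rule discretization of the kernel $2W\,\mathrm{sinc}(2W(t-s))$, whose eigenvalues then approximate the $\lambda_n$. The function name `time_band_eigenvalues`, the grid size, and the choice $T = 4$, $W = 1$ are illustrative assumptions.

```python
import numpy as np

def time_band_eigenvalues(T, W, num_points=400):
    """Approximate the lambda_n by discretizing the kernel 2W sinc(2W(t - s))
    on a midpoint grid over [-T/2, T/2] (an assumed numerical scheme)."""
    dt = T / num_points
    t = -T / 2 + dt * (np.arange(num_points) + 0.5)   # midpoints of the grid cells
    # np.sinc(x) is sin(pi x) / (pi x), the same convention as sinc in the text.
    kernel = 2 * W * np.sinc(2 * W * (t[:, None] - t[None, :])) * dt
    eigenvalues = np.linalg.eigvalsh(kernel)          # symmetric matrix, ascending order
    return eigenvalues[::-1]                          # largest first

T, W = 4.0, 1.0                                       # illustrative values, 2TW = 8
lam = time_band_eigenvalues(T, W)
for n in range(12):
    print(f"n={n:2d}  lambda_n ~ {lam[n]:.6f}")
```

With these values one would expect the printed eigenvalues to stay close to 1 for the first several $n$, pass through intermediate values around $n = 2TW$, and then fall off quickly; increasing $TW$ should sharpen the transition.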


The prolate spheroidal wave functions $\phi_n(t)$ have many other remarkable properties, of which
we list a few:

• For each $n$, $\phi_n(t)$ is continuous and has $n$ zero crossings.
• $\phi_n(t)$ is even for $n$ even and odd for $n$ odd.
• $\{\theta_n(t)\}$ is an orthogonal set of functions.
• In the interval $(-T/2, T/2)$, $\theta_n(t) = \lambda_n \phi_n(t)$.


5.E Exercises


5.1. (basis) Prove Theorem 5.1.1 by first suggesting an algorithm that establishes the first item 
and then an algorithm to establish the second item. 

5.2. Show that the 0 vector can be part of a spanning set but cannot be part of a linearly
independent set.

5.3. (basis) Prove that if a set of $n$ vectors uniquely spans a vector space $V$, in the sense that
each $v \in V$ has a unique representation as a linear combination of the $n$ vectors, then those
$n$ vectors are linearly independent and $V$ is an $n$-dimensional space.

5.4. ($\mathbb{R}^2$) (a) Show that the vector space $\mathbb{R}^2$ with vectors $\{v = (v_1, v_2)\}$ and inner product
$\langle v, u \rangle = v_1 u_1 + v_2 u_2$ satisfies the axioms of an inner product space.

(b) Show that, in the Euclidean plane, the length of $v$ (i.e., the distance from $0$ to $v$) is
$\|v\|$.

(c) Show that the distance from $v$ to $u$ is $\|v - u\|$.

(d) Show that $\cos(\angle(v, u)) = \frac{\langle v, u \rangle}{\|v\|\,\|u\|}$; assume that $\|u\| > 0$ and $\|v\| > 0$.

(e) Suppose that the definition of the inner product is now changed to $\langle v, u \rangle = v_1 u_1 + 2 v_2 u_2$.
Does this still satisfy the axioms of an inner product space? Do the length formula and
the angle formula still correspond to the usual Euclidean length and angle?

5.5. Consider $\mathbb{C}^n$ and define $\langle v, u \rangle$ as $\sum_{j=1}^{n} c_j v_j u_j^*$ where $c_1, \ldots, c_n$ are complex numbers. For
each of the following cases, determine whether $\mathbb{C}^n$ must be an inner product space and
explain why or why not.

(a) The $c_j$ are all equal to the same positive real number.

(b) The $c_j$ are all positive real numbers.

(c) The $c_j$ are all non-negative real numbers.

(d) The $c_j$ are all equal to the same nonzero complex number.

(e) The $c_j$ are all nonzero complex numbers.

5.6. (Triangle inequality) Prove the triangle inequality, (5.10). Hint: Expand $\|v + u\|^2$ into four
terms and use the Schwarz inequality on each of the two cross terms.

5.7. Let $u$ and $v$ be orthonormal vectors in $\mathbb{C}^n$ and let $w = w_u u + w_v v$ and $x = x_u u + x_v v$ be
two vectors in the subspace spanned by $u$ and $v$.

(a) Viewing $w$ and $x$ as vectors in the subspace $\mathbb{C}^2$, find $\langle w, x \rangle$.

(b) Now view $w$ and $x$ as vectors in $\mathbb{C}^n$, e.g., $w = (w_1, \ldots, w_n)$ where $w_j = w_u u_j + w_v v_j$
for $1 \le j \le n$. Calculate $\langle w, x \rangle$ this way and show that the answer agrees with that in
part (a).


5.8. ($\mathcal{L}_2$ inner product) Consider the vector space of $\mathcal{L}_2$ functions $\{u(t) : \mathbb{R} \to \mathbb{C}\}$. Let $v$ and $u$
be two vectors in this space represented as $v(t)$ and $u(t)$. Let the inner product be defined
by

$$\langle v, u \rangle = \int_{-\infty}^{\infty} v(t)\, u^*(t)\, dt.$$

(a) Assume that $u(t) = \sum_{k,m} \hat{u}_{k,m} \theta_{k,m}(t)$ where $\{\theta_{k,m}(t)\}$ is an orthogonal set of functions
each of energy $T$. Assume that $v(t)$ can be expanded similarly. Show that

$$\langle u, v \rangle = T \sum_{k,m} \hat{u}_{k,m} \hat{v}_{k,m}^*.$$

(b) Show that $\langle u, v \rangle$ is finite. Do not use the Schwarz inequality, because the purpose of
this exercise is to show that $\mathcal{L}_2$ is an inner product space, and the Schwarz inequality is
based on the assumption of an inner product space. Use the result in (a) along with the
properties of complex numbers (you can use the Schwarz inequality for the one dimensional
vector space $\mathbb{C}^1$ if you choose).

(c) Why is this result necessary in showing that $\mathcal{L}_2$ is an inner product space?


5.9. ($\mathcal{L}_2$ inner product) Given two waveforms $u_1, u_2 \in \mathcal{L}_2$, let $V$ be the set of all waveforms $v$
that are equidistant from $u_1$ and $u_2$. Thus

$$V = \{ v : \|v - u_1\| = \|v - u_2\| \}.$$

(a) Is $V$ a vector subspace of $\mathcal{L}_2$?

(b) Show that

$$V = \left\{ v : \Re(\langle v, u_2 - u_1 \rangle) = \frac{\|u_2\|^2 - \|u_1\|^2}{2} \right\}.$$

(c) Show that $(u_1 + u_2)/2 \in V$.

(d) Give a geometric interpretation for $V$.


5.10. (sampling) For any $\mathcal{L}_2$ function $\{u(t) : [-W, W] \to \mathbb{C}\}$ and any $t$, let $a_k = u(\frac{k}{2W})$ and let
$b_k = \mathrm{sinc}(2Wt - k)$. Show that $\sum_k |a_k|^2 < \infty$ and $\sum_k |b_k|^2 < \infty$. Use this to show that
$\sum_k |a_k b_k| < \infty$. Use this to show that the sum in the sampling equation (4.65) converges
for each $t$.


5.11. (projection) Consider the following set of functions $\{u_m(t)\}$ for integer $m \ge 0$:

$$u_0(t) = \begin{cases} 1, & 0 \le t < 1; \\ 0 & \text{otherwise.} \end{cases}$$

$$u_m(t) = \begin{cases} 1, & 0 \le t < 2^{-m}; \\ 0 & \text{otherwise.} \end{cases}$$

Consider these functions as vectors $u_0, u_1, \ldots$ over real $\mathcal{L}_2$ vector space. Note that $u_0$ is
normalized; we denote it as $\phi_0 = u_0$.

(a) Find the projection $(u_1)_{|\phi_0}$ of $u_1$ on $\phi_0$, find the perpendicular $(u_1)_{\perp\phi_0}$, and find the
normalized form $\phi_1$ of $(u_1)_{\perp\phi_0}$. Sketch each of these as functions of $t$.

(b) Express $u_1(t - 1/2)$ as a linear combination of $\phi_0$ and $\phi_1$. Express (in words) the
subspace of real $\mathcal{L}_2$ spanned by $u_1(t)$ and $u_1(t - 1/2)$. What is the subspace $S_1$ of real $\mathcal{L}_2$
spanned by $\phi_0$ and $\phi_1$?

(c) Find the projection $(u_2)_{|S_1}$ of $u_2$ on $S_1$, find the perpendicular $(u_2)_{\perp S_1}$, and find the
normalized form of $(u_2)_{\perp S_1}$. Denote this normalized form as $\phi_{2,0}$; it will be clear shortly
why a double subscript is used here. Sketch $\phi_{2,0}$ as a function of $t$.

(d) Find the projection of $u_2(t - 1/2)$ on $S_1$ and find the perpendicular $u_2(t - 1/2)_{\perp S_1}$.
Denote the normalized form of this perpendicular by $\phi_{2,1}$. Sketch $\phi_{2,1}$ as a function of $t$
and explain why $\langle \phi_{2,0}, \phi_{2,1} \rangle = 0$.


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].




(e) Express $u_2(t - 1/4)$ and $u_2(t - 3/4)$ as linear combinations of $\{\phi_0, \phi_1, \phi_{2,0}, \phi_{2,1}\}$. Let
$S_2$ be the subspace of real $\mathcal{L}_2$ spanned by $\phi_0, \phi_1, \phi_{2,0}, \phi_{2,1}$ and describe this subspace in
words.

(f) Find the projection $(u_3)_{|S_2}$ of $u_3$ on $S_2$, find the perpendicular $(u_3)_{\perp S_2}$, and find its
normalized form, $\phi_{3,0}$. Sketch $\phi_{3,0}$ as a function of $t$.

(g) For $j = 1, 2, 3$, find $u_3(t - j/4)_{\perp S_2}$ and find its normalized form $\phi_{3,j}$. Describe the
subspace $S_3$ spanned by $\phi_0, \phi_1, \phi_{2,0}, \phi_{2,1}, \phi_{3,0}, \ldots, \phi_{3,3}$.

(h) Consider iterating this process to form $S_4, S_5, \ldots$. What is the dimension of $S_m$?
Describe this subspace. Describe the projection of an arbitrary real $\mathcal{L}_2$ function constrained
to the interval $[0, 1)$ on $S_m$.

5.12. (Orthogonal subspaces) For any subspace $S$ of an inner product space $V$, define $S^\perp$ as the
set of vectors $v \in V$ that are orthogonal to all $w \in S$.

(a) Show that $S^\perp$ is a subspace of $V$.

(b) Assuming that $S$ is finite dimensional, show that any $u \in V$ can be uniquely decomposed
into $u = u_{|S} + u_{\perp S}$ where $u_{|S} \in S$ and $u_{\perp S} \in S^\perp$.

(c) Assuming that $V$ is finite dimensional, show that $V$ has an orthonormal basis where
some of the basis vectors form a basis for $S$ and the remaining basis vectors form a basis
for $S^\perp$.


5.13. (Orthonormal expansion) Expand the function $\mathrm{sinc}(3t/2)$ as an orthonormal expansion in
the set of functions $\{\mathrm{sinc}(t - n);\ -\infty < n < \infty\}$.


5.14. (bizarre function) (a) Show that the pulses $g_n(t)$ in Example 5A.1 of Section 5A.1 overlap
each other either completely or not at all.

(b) Modify each pulse $g_n(t)$ to $h_n(t)$ as follows: Let $h_n(t) = g_n(t)$ if $\sum_{i=1}^{n-1} g_i(t)$ is even and
let $h_n(t) = -g_n(t)$ if $\sum_{i=1}^{n-1} g_i(t)$ is odd. Show that $\sum_{i=1}^{n} h_i(t)$ is bounded between 0 and 1
for each $t \in (0, 1)$ and each $n \ge 1$.

(c) Show that there are a countably infinite number of points $t$ at which $\sum_n h_n(t)$ does not
converge.


5.15. (Parseval) Prove Parseval’s relation, (4.44), for $\mathcal{L}_2$ functions. Use the same argument as
used to establish the energy equation in the proof of Plancherel’s theorem.

5.16. (Aliasing theorem) Assume that $\hat{u}(f)$ is $\mathcal{L}_2$ and $\lim_{|f|\to\infty} \hat{u}(f)|f|^{1+\varepsilon} = 0$ for some $\varepsilon > 0$.

(a) Show that for large enough $A > 0$, $|\hat{u}(f)| \le |f|^{-1-\varepsilon}$ for $|f| > A$.

(b) Show that $\hat{u}(f)$ is $\mathcal{L}_1$. Hint: for the $A$ above, split the integral $\int |\hat{u}(f)|\, df$ into one
integral for $|f| > A$ and another for $|f| \le A$.

(c) Show that, for $T = 1$, $\hat{s}(f)$ as defined in (5.44), satisfies


(d) Show that $\hat{s}(f)$ is $\mathcal{L}_2$ for $T = 1$. Use scaling to show that $\hat{s}(f)$ is $\mathcal{L}_2$ for any $T > 0$.


Chapter 6


Channels, modulation, and demodulation

6.1 Introduction 


Digital modulation (or channel encoding) is the process of converting an input sequence of bits 
into a waveform suitable for transmission over a communication channel. Demodulation (channel 
decoding) is the corresponding process at the receiver of converting the received waveform into a 
(perhaps noisy) replica of the input bit sequence. Chapter 1 discussed the reasons for using a bit 
sequence as the interface between an arbitrary source and an arbitrary channel, and Chapters 
2 and 3 discussed how to encode the source output into a bit sequence. 


Chapters 4 and 5 developed the signal-space view of waveforms. As explained there, the source
and channel waveforms of interest can be represented as real or complex¹ $\mathcal{L}_2$ vectors. Any such
vector can be viewed as a conventional function of time, $x(t)$. Given an orthonormal basis
$\{\phi_1(t), \phi_2(t), \ldots\}$ of $\mathcal{L}_2$, any such $x(t)$ can be represented as

$$x(t) = \sum_j x_j \phi_j(t). \tag{6.1}$$

Each $x_j$ in (6.1) can be uniquely calculated from $x(t)$, and the above series converges in $\mathcal{L}_2$ to
$x(t)$. Moreover, starting from any sequence satisfying $\sum_j |x_j|^2 < \infty$ there is an $\mathcal{L}_2$ function $x(t)$
satisfying (6.1) with $\mathcal{L}_2$ convergence. This provides a simple and generic way of going back and
forth between functions of time and sequences of numbers. The basic parts of a modulator will 
then turn out to be a procedure for mapping a sequence of binary digits into a sequence of real 
or complex numbers, followed by the above approach for mapping a sequence of numbers into a 
waveform. 


In most cases of modulation, the set of waveforms $\phi_1(t), \phi_2(t), \ldots$ in (6.1) will be chosen not
as a basis for $\mathcal{L}_2$ but as a basis for some subspace² of $\mathcal{L}_2$ such as the set of functions that are
baseband limited to some frequency $W$ or passband limited to some range of frequencies. In
some cases, it will also be desirable to use a sequence of waveforms that are not orthonormal.


¹As explained later, the actual transmitted waveforms are real. However, they are usually bandpass real
waveforms that are conveniently represented as complex baseband waveforms.

²Equivalently, $\phi_1(t), \phi_2(t), \ldots$ can be chosen as a basis of $\mathcal{L}_2$ but the set of indices for which $x_j$ is allowed to
be nonzero can be restricted.

We can view the mapping from bits to numerical signals and the conversion of signals to a
waveform as separate layers. The demodulator then maps the received waveform to a sequence 
of received signals, which is then mapped to a bit sequence, hopefully equal to the input bit 
sequence. A major objective in designing the modulator and demodulator is to maximize the 
rate at which bits enter the encoder, subject to the need to retrieve the original bit stream with 
a suitably small error rate. Usually this must be done subject to constraints on the transmitted 
power and bandwidth. In practice there are also constraints on delay, complexity, compatibility 
with standards, etc., but these need not be a major focus here. 


Example 6.1.1. As a particularly simple example, suppose a sequence of binary symbols enters
the encoder at $T$-spaced instants of time. These symbols can be mapped into real numbers using
the mapping $0 \to +1$ and $1 \to -1$. The resulting sequence $u_1, u_2, \ldots$ of real numbers is then
mapped into the transmitted waveform

$$u(t) = \sum_k u_k\, \mathrm{sinc}\left(\frac{t}{T} - k\right). \tag{6.2}$$

At the receiver, in the absence of noise, attenuation, and other imperfections, the received
waveform is $u(t)$. This can be sampled at times $T, 2T, \ldots$ to retrieve $u_1, u_2, \ldots$, which can be
decoded into the original binary symbols.
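
The example can be traced numerically. The sketch below is illustrative only: it assumes a finite block of bits (so the sum in (6.2) is finite) and a noiseless channel, and the helper names `sinc`, `modulate`, and `demodulate` are hypothetical, not part of the text.

```python
import math

def sinc(x):
    """sinc(x) = sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def modulate(bits, T):
    """Map bits to signals (0 -> +1, 1 -> -1) and build u(t) = sum_k u_k sinc(t/T - k)."""
    signals = [1.0 if b == 0 else -1.0 for b in bits]

    def u(t):
        # Signals are indexed k = 1, 2, ... as in the example.
        return sum(u_k * sinc(t / T - k) for k, u_k in enumerate(signals, start=1))

    return signals, u

def demodulate(u, T, n):
    """Sample the waveform at T, 2T, ..., nT and decode each sample by its sign."""
    samples = [u(k * T) for k in range(1, n + 1)]
    return [0 if s > 0 else 1 for s in samples]

bits = [0, 1, 1, 0, 1]
signals, u = modulate(bits, T=0.5)
received = demodulate(u, T=0.5, n=len(bits))
print(received == bits)
```

Sampling works because $\mathrm{sinc}(j - k)$ is 1 for $j = k$ and 0 for every other integer, so the sample at $jT$ picks out $u_j$ alone (up to floating-point rounding).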


The above example contains rudimentary forms of the two layers discussed above. The first is
the mapping of binary symbols into numerical signals³ and the second is the conversion of the
sequence of signals into a waveform. In general, the set of $T$-spaced sinc functions in (6.2) can
be replaced by any other set of orthogonal functions (or even non-orthogonal functions). Also,
the mapping $0 \to +1$, $1 \to -1$ can be generalized by segmenting the binary stream into $b$-tuples
of binary symbols, which can then be mapped into $n$-tuples of real or complex numbers. The
set of $2^b$ possible $n$-tuples resulting from this mapping is called a signal constellation.


Modulators usually include a third layer, which maps a baseband encoded waveform, such as $u(t)$
in (6.2), into a passband waveform $x(t) = \Re\{u(t) e^{2\pi i f_c t}\}$ centered on a given carrier frequency
$f_c$. At the decoder this passband waveform is mapped back to baseband before performing the
other components of decoding. This frequency conversion operation at encoder and decoder is
often referred to as modulation and demodulation, but it is more common today to use the
word modulation for the entire process of mapping bits to waveforms. Figure 6.1 illustrates
these three layers.


Binary input → [Bits to signals] → sequence of signals → [Signals to waveform] → baseband waveform → [Baseband to passband] → passband waveform → Channel
Binary output ← [Signal decoder] ← [Waveform to signals] ← [Passband to baseband] ← Channel

Figure 6.1: The layers of a modulator (channel encoder) and demodulator (channel decoder).


We have illustrated the channel above as a one-way device going from source to destination.
Usually, however, communication goes both ways, so that a physical location can send data to
another location and also receive data from that remote location. A physical device that both
encodes data going out over a channel and also decodes oppositely directed data coming in from
the channel is called a modem (for modulator/demodulator). As described in Chapter 1, feedback
on the reverse channel can be used to request retransmissions on the forward channel, but in
practice, this is usually done as part of an automatic retransmission request (ARQ) strategy in
the data link control layer. Combining coding with more sophisticated feedback strategies than
ARQ has always been an active area of communication and information theoretic research, but
it will not be discussed here for the following reasons:

• It is important to understand communication in a single direction before addressing the
complexities of two directions.
• Feedback does not increase channel capacity for typical channels (see [28]).
• Simple error detection and retransmission is best viewed as a topic in data networks.

There is an interesting analogy between analog source coding and digital modulation. With
analog source coding, an analog waveform is first mapped into a sequence of real or complex
numbers (e.g., the coefficients in an orthogonal expansion). This sequence of signals is then
quantized into a sequence of symbols from a discrete alphabet, and finally the symbols are
encoded into a binary sequence. With modulation, a sequence of bits is encoded into a sequence
of signals from a signal constellation. The elements of this constellation are real or complex
points in one or several dimensions. This sequence of signal points is then mapped into a
waveform by the inverse of the process for converting waveforms into sequences.

³The word signal is often used in the communication literature to refer to symbols, vectors, waveforms, or
almost anything else. Here we use it only to refer to real or complex numbers (or $n$-tuples of numbers) in situations
where the numerical properties are important. For example, in (6.2) the signals (numerical values) $u_1, u_2, \ldots$
determine the real valued waveform $u(t)$, whereas the binary input symbols could be ‘Alice’ and ‘Bob’ as easily
as 0 and 1.


6.2 Pulse amplitude modulation (PAM) 


Pulse amplitude modulation⁴ (PAM) is probably the simplest type of modulation. The
incoming binary symbols are first segmented into $b$-bit blocks. There is a mapping from the set
of $M = 2^b$ possible blocks into a signal constellation $\mathcal{A} = \{a_1, a_2, \ldots, a_M\}$ of real numbers. Let
$R$ be the rate of incoming binary symbols in bits per second. Then the sequence of $b$-bit blocks,
and the corresponding sequence $u_1, u_2, \ldots$ of $M$-ary signals, has a rate of $R_s = R/b$ signals
per second. The sequence of signals is then mapped into a waveform $u(t)$ by the use of time
shifts of a basic pulse waveform $p(t)$, i.e.,

$$u(t) = \sum_k u_k\, p(t - kT), \tag{6.3}$$

where $T = 1/R_s$ is the interval between successive signals. The special case where $b = 1$ is
called binary PAM and the case $b > 1$ is called multilevel PAM. Example 6.1.1 is an example
of binary PAM where the basic pulse shape $p(t)$ is a sinc function. Comparing (6.1) with (6.3),
we see that PAM is a special case of digital modulation in which the underlying set of functions
$\phi_1(t), \phi_2(t), \ldots$ is replaced by functions that are $T$-spaced time shifts of a basic function $p(t)$.

The following two subsections discuss the signal constellation (i.e., the outer layer in Figure
6.1) and the subsequent two discuss the choice of pulse waveform $p(t)$ (i.e., the middle layer in
Figure 6.1). In most cases⁵, the pulse waveform $p(t)$ is a baseband waveform and the resulting
modulated waveform $u(t)$ is then modulated up to some passband (i.e., the inner layer in Figure
6.1). Section 6.4 discusses modulation from baseband to passband and back.

⁴The terminology comes from analog amplitude modulation, where a baseband waveform is modulated up
to some passband for communication. For digital communication, the more interesting problem is turning a bit
stream into a waveform at baseband.


6.2.1 Signal constellations 


A standard $M$-PAM signal constellation $\mathcal{A}$ (see Figure 6.2) consists of $M = 2^b$ $d$-spaced real
numbers located symmetrically about the origin; i.e.,

$$\mathcal{A} = \left\{ \frac{-d(M-1)}{2}, \ldots, \frac{-d}{2}, \frac{d}{2}, \ldots, \frac{d(M-1)}{2} \right\}.$$

In other words, the signal points are the same as the representation points of a symmetric
$M$-point uniform scalar quantizer.


Figure 6.2: An 8-PAM signal set, with signal points $a_1, a_2, \ldots, a_8$ on the real line.


If the incoming bits are independent equiprobable random symbols (which is well approximated
by effective source coding), then each signal $u_k$ is a sample value of a random variable $U_k$ that is
equiprobable over the constellation (alphabet) $\mathcal{A}$. Also the sequence $U_1, U_2, \ldots$ is independent
and identically distributed (iid). As derived in Exercise 6.1, the mean squared signal value, or
“energy per signal” $E_s = \mathsf{E}[U_k^2]$ is then given by

$$E_s = \frac{d^2(M^2 - 1)}{12} = \frac{d^2(2^{2b} - 1)}{12}. \tag{6.4}$$

For example, for $M = 2$, 4 and 8, we have $E_s = d^2/4$, $5d^2/4$ and $21d^2/4$, respectively.
For $b$ greater than 2, $2^{2b} - 1$ is approximately $2^{2b}$, so we see that each unit increase in $b$ increases
$E_s$ by a factor of 4. Thus increasing the rate $R$ by increasing $b$ requires impractically large
energy for large $b$.
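
The energy formula can be checked directly by averaging the squared constellation points. The following is a small sketch using exact rational arithmetic; the function names `standard_pam` and `energy_per_signal` are hypothetical, and the signals are assumed equiprobable as in the text.

```python
from fractions import Fraction

def standard_pam(M, d=1):
    """Standard M-PAM constellation: M points spaced d apart, symmetric about 0."""
    d = Fraction(d)
    return [d * (2 * i - (M - 1)) / 2 for i in range(M)]

def energy_per_signal(points):
    """Mean squared value of an equiprobable signal drawn from `points`."""
    return sum(a * a for a in points) / len(points)

for b in (1, 2, 3, 4):
    M = 2 ** b
    measured = energy_per_signal(standard_pam(M))     # in units of d^2
    formula = Fraction(M * M - 1, 12)                 # right side of (6.4) with d = 1
    print(f"b={b}  M={M}  E_s/d^2 = {measured}  formula = {formula}")
```

Comparing successive rows of the output shows the growth in $E_s$ with each added bit per signal.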


Before explaining why standard M-PAM is a good choice for PAM and what factors affect the 
choice of constellation size M and distance d, a brief introduction to channel imperfections is 
required. 


⁵Ultra-wide-band modulation (UWB) is an interesting modulation technique where the transmitted waveform
is essentially a baseband PAM system over a ‘baseband’ of multiple gigahertz. This is discussed briefly in Chapter
9.


6.2.2 Channel imperfections: a preliminary view


Physical waveform channels are always subject to propagation delay, attenuation, and noise.
Many wireline channels can be reasonably modeled using only these degradations, whereas
wireless channels are subject to other degradations discussed in Chapter 9. This subsection provides
a preliminary look at delay, then attenuation, and finally noise.


The time reference at a communication receiver is conventionally delayed relative to that at the 
transmitter. A waveform u(t) at the transmitter is subject to propagation delay plus various 
filter delays in the modulator and demodulator. Thus $u(t)$, according to the transmitter clock,
appears as $u(t - \tau)$ at the receiver, where $\tau$ is the overall delay. By delaying the receiver clock by $\tau$
from the transmitter clock, the received waveform, according to the receiver clock, is $u(t)$. With
this convention, the channel can be modeled as having no delay, and all equations will be greatly 
simplified. This explains why communication engineers often model filters in the modulator and 
demodulator as being noncausal, since responses before time 0 can be added to the difference 
between the two clocks. Estimating the above fixed delay at the receiver is a significant problem 
called timing recovery, but is largely separable from the problem of recovering the transmitted 
data. 


The magnitude of delay in a communication system is often important. It is one of the parameters
often referred to as quality of service in a communication system. Delay is important for
voice communication and often critically important when the communication is in the feedback
loop of a real time control system. In addition to the fixed delay in time reference between modulator
and demodulator, there is also delay in source encoding and decoding. Coding for error
correction adds additional delay, which might or might not be counted as part of the
modulator/demodulator delay. Either way, the delays in the source coding and error-correction coding
are often much larger than that in the modulator/demodulator proper. Thus this latter delay
can be significant, but is usually not of primary significance. Also, as channel speeds increase,
the filtering delays in the modulator/demodulator become even less significant.


Amplitudes are usually measured on a different scale at transmitter and receiver. The actual 
power attenuation suffered in transmission is a product of amplifier gain, antenna coupling 
losses, antenna directional gain, propagation losses, etc. The process of finding all these gains 
and losses (and perhaps changing them) is called “the link budget.” Such gains and losses are 
invariably calculated in decibels (dB). The number of decibels corresponding to a power gain
$\alpha$ is defined to be $10 \log_{10} \alpha$. Thus power losses correspond to negative dB and power gains to
positive dB. The use of a logarithmic measure of gain allows the various components of gain to
be added rather than multiplied. 


The use of decibels rather than some other logarithmic measure such as natural logs or logs to
the base 2 is partly motivated by the ease of doing rough mental calculations. A factor of 2 is
$10 \log_{10} 2 = 3.010\ldots$ dB, approximated as 3 dB. Thus $4 = 2^2$ is 6 dB and 8 is 9 dB. Since 10
is 10 dB, we also see that 5 is 10/2 or 7 dB. We can just as easily see that 20 is 13 dB and so
forth.


It is important to remember that the gains expressed in dB are power gains. Thus if there is a
multiplicative gain of $g$ in a signal, this corresponds to a gain $g^2$ in power, which corresponds
to $20 \log_{10} g$ dB.
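
The decibel rules above can be sketched in a few lines. The helper names are hypothetical, and the link-budget figures in `stages_db` are made-up illustrative numbers, not values from the text; they only show that gains in dB add where the underlying power ratios multiply.

```python
import math

def power_gain_to_db(alpha):
    """Decibels for a power gain alpha: 10 log10(alpha)."""
    return 10 * math.log10(alpha)

def amplitude_gain_to_db(g):
    """Decibels for an amplitude gain g, i.e. a power gain g^2: 20 log10(g)."""
    return 20 * math.log10(g)

def db_to_power_gain(db):
    """Inverse of power_gain_to_db."""
    return 10 ** (db / 10)

# Exact values next to the mental approximations (3, 6, 7, 9, 10, 13 dB).
for alpha in (2, 4, 5, 8, 10, 20):
    print(f"power gain {alpha:2d} = {power_gain_to_db(alpha):.2f} dB")

# Hypothetical link budget: amplifier, coupling loss, propagation loss, antenna gain.
stages_db = [30.0, -3.0, -100.0, 20.0]
total_db = sum(stages_db)
print(f"total: {total_db} dB, power ratio {db_to_power_gain(total_db):.3e}")
```

The rounded values (3 dB for a factor of 2, 7 dB for a factor of 5) are accurate to within a few hundredths of a dB, which is why they suffice for mental arithmetic.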


The link budget in a communication system is largely separable from other issues, so the
amplitude scale at the transmitter is usually normalized to that at the receiver.

By treating attenuation and delay as issues largely separable from modulation, we obtain a model
of the channel in which a baseband waveform u(t) is converted to passband and transmitted. At 
the receiver, after conversion back to baseband, a waveform v(t) = u(t) + z(t) is received where 
z(t) is noise. This noise is a fundamental limitation to communication and arises from a variety 
of causes, including thermal effects and unwanted radiation impinging on the receiver. Chapter 
7 is largely devoted to understanding noise waveforms by modeling them as sample values of 
random processes. Chapter 8 then explains how best to decode signals in the presence of noise. 
These issues are briefly summarized here to see how they affect the choice of signal constellation. 


If $p(t)$ has unit energy and is orthogonal to all its shifts by multiples of $T$, so that the shifted
pulses $\{p(t - kT)\}$ form an orthonormal set, then, in the absence of noise, the transmitted
signals $u_1, u_2, \ldots$ can be retrieved from the baseband waveform $u(t)$ by the inner product
operation,

$$u_k = \int u(t)\, p(t - kT)\, dt.$$

In the presence of noise, this same operation can be performed, yielding

$$v_k = \int v(t)\, p(t - kT)\, dt = u_k + z_k, \tag{6.5}$$

where $z_k = \int z(t)\, p(t - kT)\, dt$ is the projection of $z(t)$ on the shifted pulse $p(t - kT)$.


The most common (and often the most appropriate) model for noise on channels is called the
additive white Gaussian noise model. As shown in Chapters 7 and 8, the above coefficients
$\{z_k; k \in \mathbb{Z}\}$ in this model are the sample values of zero-mean, iid Gaussian random variables
$\{Z_k; k \in \mathbb{Z}\}$. This is true no matter how the orthonormal functions $\{p(t - kT); k \in \mathbb{Z}\}$ are chosen,
and these random variables are also independent of the signal random variables $\{U_k; k \in \mathbb{Z}\}$.
Chapter 8 also shows that the operation in (6.5) is the appropriate operation to go from waveform
to signal sequence in the layered demodulator of Figure 6.1.
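
The projection in (6.5) can be illustrated with a crude discrete-time sketch. Everything specific here is an assumption made for illustration: the pulse is taken to be rectangular, $p(t) = 1/\sqrt{T}$ on $[0, T)$, so that its $T$-spaced shifts are orthonormal; signals are indexed from $k = 0$; integrals are replaced by Riemann sums with step `dt`; and white noise is mimicked by independent Gaussian samples of standard deviation $\sigma/\sqrt{dt}$, chosen so that each projected noise value has standard deviation $\sigma$.

```python
import math
import random

def pam_waveform(signals, T, dt):
    """Samples of u(t) = sum_k u_k p(t - kT) for the rectangular pulse
    p(t) = 1/sqrt(T) on [0, T) (an assumed pulse shape)."""
    per_symbol = round(T / dt)
    amp = 1 / math.sqrt(T)
    return [u_k * amp for u_k in signals for _ in range(per_symbol)]

def project(waveform, T, dt):
    """Riemann-sum approximation of the inner products with p(t - kT), k = 0, 1, ..."""
    per_symbol = round(T / dt)
    amp = 1 / math.sqrt(T)
    return [sum(waveform[k * per_symbol:(k + 1) * per_symbol]) * amp * dt
            for k in range(len(waveform) // per_symbol)]

random.seed(1)
T, dt, sigma = 1.0, 0.01, 0.1
signals = [1.0, -1.0, -1.0, 1.0]

u = pam_waveform(signals, T, dt)
z = [random.gauss(0.0, sigma / math.sqrt(dt)) for _ in u]   # discretized noise samples
v = [a + b for a, b in zip(u, z)]                           # received waveform v = u + z

u_hat = project(u, T, dt)    # noiseless projections recover the signals
z_hat = project(z, T, dt)    # noise projections z_k
v_hat = project(v, T, dt)    # equals u_k + z_k by linearity
for k in range(len(signals)):
    print(k, round(u_hat[k], 6), round(z_hat[k], 4), round(v_hat[k], 4))
```

Because the projection is linear, each received value splits into the transmitted signal plus the projection of the noise alone, which is the content of (6.5).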


Now consider the effect of the noise on the choice of $M$ and $d$ in a PAM modulator. Since the
transmitted signal reappears at the receiver with a zero-mean Gaussian random variable added
to it, any attempt to directly retrieve $U_k$ from $V_k$ with reasonably small probability of error⁶
will require $d$ to exceed several standard deviations of the noise. Thus the noise determines how
large $d$ must be, and this, combined with the power constraint, determines $M$.

The relation between error probability and signal-point spacing also helps explain why multilevel
PAM systems almost invariably use a standard $M$-PAM signal set. Because the Gaussian
density drops off so fast with increasing distance, the error probability due to confusion of
nearest neighbors drops off equally fast. Thus error probability is dominated by the points in
the constellation that are closest together. If the signal points are constrained to have some
minimum distance $d$ between points, it can be seen that the minimum energy $E_s$ for a given
number of points $M$ is achieved by the standard $M$-PAM set.⁷


To be more specific about the relationship between $M$, $d$ and the variance $\sigma^2$ of the noise $Z_k$,
suppose that $d$ is selected to be $\alpha\sigma$, where $\alpha$ is chosen to make the detection sufficiently reliable.
Then with $M = 2^b$, where $b$ is the number of bits encoded into each PAM signal, (6.4) becomes

$$E_s = \frac{\alpha^2\sigma^2(2^{2b} - 1)}{12}; \qquad b = \frac{1}{2}\log_2\left(1 + \frac{12 E_s}{\alpha^2\sigma^2}\right). \tag{6.6}$$


⁶If error-correction coding is used with PAM, then $d$ can be smaller, but for any given error-correction code,
$d$ still depends on the standard deviation of $Z_k$.

⁷On the other hand, if we choose a set of $M$ signal points to minimize $E_s$ for a given error probability, then
the standard $M$-PAM signal set is not quite optimal (see Exercise 6.3).




This expression looks strikingly similar to Shannon’s capacity formula for additive white Gaussian
noise, which says that for the appropriate PAM bandwidth, the capacity per signal is
$C = \frac{1}{2}\log_2\left(1 + \frac{E_s}{\sigma^2}\right)$. The important difference is that in (6.6), $\alpha$ must be increased, thus
decreasing $b$, in order to decrease error probability. Shannon’s result, on the other hand, says
that error probability can be made arbitrarily small for any number of bits per signal less than
$C$. Both equations, however, show the same basic form of relationship between bits per signal
and the signal to noise ratio $E_s/\sigma^2$. Both equations also say that if there is no noise ($\sigma^2 = 0$),
then the number of transmitted bits per signal can be infinitely large (i.e., the distance $d$
between signal points can be made infinitesimally small). Thus both equations say that noise is
a fundamental limitation on communication.
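
To see the two formulas side by side, the sketch below evaluates (6.6) and the capacity expression over a range of signal to noise ratios. The value $\alpha = 6$ is an arbitrary illustrative choice for "several standard deviations", not a value taken from the text, and the function names are hypothetical.

```python
import math

def pam_bits_per_signal(es_over_sigma2, alpha):
    """b from (6.6): (1/2) log2(1 + 12 E_s / (alpha^2 sigma^2))."""
    return 0.5 * math.log2(1 + 12 * es_over_sigma2 / alpha ** 2)

def capacity_per_signal(es_over_sigma2):
    """C = (1/2) log2(1 + E_s / sigma^2), in bits per signal."""
    return 0.5 * math.log2(1 + es_over_sigma2)

alpha = 6.0    # hypothetical spacing d = alpha * sigma
for snr_db in (10, 20, 30, 40):
    snr = 10 ** (snr_db / 10)
    b = pam_bits_per_signal(snr, alpha)
    c = capacity_per_signal(snr)
    print(f"E_s/sigma^2 = {snr_db} dB: uncoded PAM b = {b:.2f}, capacity C = {c:.2f}")
```

The two expressions coincide when $\alpha^2 = 12$; for any larger $\alpha$, the number of bits per signal from (6.6) falls below $C$, and at high signal to noise ratio the gap approaches the constant $\frac{1}{2}\log_2(\alpha^2/12)$.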


6.2.3 Choice of the modulation pulse


As defined in (6.3), the baseband transmitted waveform, $u(t) = \sum_k u_k\, p(t - kT)$, for a PAM
modulator is determined by the signal constellation $\mathcal{A}$, the signal interval $T$ and the real $\mathcal{L}_2$
modulation pulse $p(t)$.


It may be helpful to visualize $p(t)$ as the impulse response of a linear time-invariant filter. Then
$u(t)$ is the response of that filter to a sequence of $T$-spaced impulses $\{u_k \delta(t - kT)\}$. The problem
of choosing $p(t)$ for a given $T$ turns out to be largely separable from that of choosing $\mathcal{A}$. The
choice of $p(t)$ is also the more challenging and interesting problem.


The following objectives contribute to the choice of p(t). 


• $p(t)$ must be 0 for $t < -\tau$ for some finite $\tau$. To see this, assume that the $k$th input signal
at the modulator is generated at time $kT - \tau$. The contribution of $u_k$ to the transmitted
waveform $u(t)$ cannot start until $kT - \tau$, which implies $p(t) = 0$ for $t < -\tau$ as stated. This
rules out $\mathrm{sinc}(t/T)$ as a choice for $p(t)$ (although $\mathrm{sinc}(t/T)$ could be truncated at $t = -\tau$
to satisfy the condition).

• In most situations, $\hat{p}(f)$ should be essentially baseband limited to some bandwidth $B_b$
slightly larger than $\frac{1}{2T}$. We will see shortly that it cannot be baseband limited to less than
$\frac{1}{2T}$. There is usually an upper limit on $B_b$ because of regulatory constraints at bandpass
or to allow for other transmission channels in neighboring bands. If this limit were much
larger than $\frac{1}{2T}$, then $T$ could be decreased, increasing the rate of transmission.

• The retrieval of the sequence $\{u_k; k \in \mathbb{Z}\}$ from the noisy received waveform should be simple
and relatively reliable. In the absence of noise, $\{u_k; k \in \mathbb{Z}\}$ should be uniquely specified by
the received waveform.


The first condition above makes it somewhat tricky to satisfy the second condition. In particular,
the Paley-Wiener theorem [20] states that a necessary and sufficient condition for a nonzero $\mathcal{L}_2$
function $p(t)$ to be zero for all $t < 0$ is that its Fourier transform satisfy

$$\int_{-\infty}^{\infty} \frac{\left| \ln |\hat{p}(f)| \right|}{1 + f^2}\, df < \infty. \tag{6.7}$$


Combining this with the shift condition for Fourier transforms, it says that any $\mathcal{L}_2$ function
that is 0 for all $t < -\tau$ for any finite delay $\tau$ must also satisfy (6.7). This is a particularly
strong statement of the fact that functions cannot be both time and frequency limited. One
consequence of (6.7) is that if $p(t) = 0$ for $t < -\tau$, then $\hat{p}(f)$ must be nonzero except on a set of
measure 0. Another consequence is that $\hat{p}(f)$ must go to 0 with increasing $f$ more slowly than
exponentially.


The Paley-Wiener condition turns out to be useless as a tool for choosing $p(t)$. First, it distinguishes
whether the above delay $\tau$ is finite or infinite, but gives no indication of its value when
finite. Second, if an $\mathcal{L}_2$ function $p(t)$ is chosen with no concern for (6.7), it can then be truncated
to be 0 for $t < -\tau$. The resulting $\mathcal{L}_2$ error caused by truncation can be made arbitrarily
small by choosing $\tau$ sufficiently large. The tradeoff between truncation error and delay is clearly
improved by choosing $p(t)$ to approach 0 rapidly as $t \to -\infty$.


In summary, we will replace the first objective above with the objective of choosing p(t) to
approach 0 rapidly as $t \to -\infty$. The resulting p(t) will then be truncated to satisfy the original
objective. Thus $p(t) \leftrightarrow \hat{p}(f)$ will be an approximation to the transmit pulse in what follows.
This also means that $\hat{p}(f)$ can be strictly bandlimited to a frequency slightly larger than $1/(2T)$.


We next turn to the third objective, particularly that of easily retrieving the sequence $u_1, u_2, \ldots$
from u(t) in the absence of noise. This problem was first analyzed in 1928 in a classic paper
by Harry Nyquist [19]. Before looking at Nyquist's results, however, we must consider the
demodulator.


6.2.4 PAM demodulation 


For the time being, ignore the channel noise. Assume that the time reference and the amplitude 
scaling at the receiver have been selected so that the received baseband waveform is the same as 
the transmitted baseband waveform u(t). This also assumes that no noise has been introduced 
by the channel. 


The problem at the demodulator is then to retrieve the transmitted signals $u_1, u_2, \ldots$ from the
received waveform $u(t) = \sum_k u_k\, p(t - kT)$. The middle layer of a PAM demodulator is defined by
a signal interval T (the same as at the modulator) and a real $\mathcal{L}_2$ waveform q(t). The demodulator
first filters the received waveform using a filter with impulse response q(t). It then samples the
output at T-spaced sample times. That is, the received filtered waveform is

$$r(t) = \int_{-\infty}^{\infty} u(\tau)\, q(t - \tau)\, d\tau, \tag{6.8}$$

and the received samples are r(T), r(2T), ....


Our objective is to choose p(t) and q(t) so that $r(kT) = u_k$ for each k. If this objective is met
for all choices of $u_1, u_2, \ldots$, then the PAM system involving p(t) and q(t) is said to have no
intersymbol interference. Otherwise, intersymbol interference is said to exist. The reader should
verify that $p(t) = q(t) = \frac{1}{\sqrt{T}}\,\mathrm{sinc}(t/T)$ is one solution.


This problem of choosing filters to avoid intersymbol interference at first appears to be somewhat 
artificial. First, the form of the receiver is restricted to be a filter followed by a sampler. Exercise 
6.4 shows that if the detection of each signal is restricted to a linear operation on the received 
waveform, then there is no real loss of generality in further restricting the operation to be a 
filter followed by a T-spaced sampler. This does not explain the restriction to linear operations, 
however. 


The second artificiality is neglecting the noise, thus neglecting the fundamental limitation on 
the bit rate. The reason for posing this artificial problem is, first, that avoiding intersymbol 
interference is significant in choosing p(t), and, second, that there is a simple and elegant solution 
to this problem. This solution also provides part of the solution when noise is brought into the
picture. 


Recall that $u(t) = \sum_k u_k\, p(t - kT)$; thus from (6.8)

$$r(t) = \int_{-\infty}^{\infty} \sum_k u_k\, p(\tau - kT)\, q(t - \tau)\, d\tau. \tag{6.9}$$


Let g(t) be the convolution $g(t) = p(t) * q(t) = \int p(\tau)\, q(t - \tau)\, d\tau$ and assume⁸ that g(t) is $\mathcal{L}_2$.
We can then simplify (6.9) to

$$r(t) = \sum_k u_k\, g(t - kT). \tag{6.10}$$


This should not be surprising. The filters p(t) and q(t) are in cascade with each other. Thus r(t) 
does not depend on which part of the filtering is done in one and which in the other; it is only 
the convolution g(t) that determines r(t). Later, when channel noise is added, the individual 
choice of p(t) and q(t) will become important. 


There is no intersymbol interference if $r(kT) = u_k$ for each integer k, and from (6.10) this is
satisfied if g(0) = 1 and g(kT) = 0 for each nonzero integer k. Waveforms with this property
are said to be ideal Nyquist or, more precisely, ideal Nyquist with interval T.


Even though the clock at the receiver is delayed by some finite amount relative to that at the
transmitter, and each signal $u_k$ can be generated at the transmitter at some finite time before
kT, g(t) must still have the property that g(t) = 0 for $t < -\tau$ for some finite $\tau$. As before with
the transmit pulse p(t), this finite delay constraint will be replaced with the objective that g(t)
should approach 0 rapidly as $|t| \to \infty$. Thus the function sinc(t/T) is ideal Nyquist with interval
T, but is unsuitable because of the slow approach to 0 as $|t| \to \infty$.


As another simple example, the function rect(t/T) is ideal Nyquist with interval T and can be 
generated with finite delay, but is not remotely close to being baseband limited. 
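The ideal Nyquist property is easy to check numerically. The following is a minimal sketch, assuming a hypothetical interval T = 0.5 and a hypothetical 4-PAM sequence; the helper names (`is_ideal_nyquist`, `received_samples`) are illustrative and not from the text. It evaluates (6.10) at the sample times for the two examples above and for a Gaussian pulse, which is smooth but not ideal Nyquist.

```python
import math

def sinc(x):
    """sinc(x) = sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def rect(x):
    """rect(x) = 1 for |x| < 1/2 and 0 otherwise (the edge value is immaterial here)."""
    return 1.0 if abs(x) < 0.5 else 0.0

def is_ideal_nyquist(g, T, K=20, tol=1e-9):
    """Check g(0) = 1 and g(kT) = 0 for 0 < |k| <= K (a finite check only)."""
    if abs(g(0.0) - 1.0) > tol:
        return False
    return all(abs(g(k * T)) <= tol for k in range(-K, K + 1) if k != 0)

def received_samples(u, g, T):
    """r(kT) = sum_j u_j g((k - j)T), i.e. (6.10) evaluated at t = kT."""
    n = len(u)
    return [sum(u[j] * g((k - j) * T) for j in range(n)) for k in range(n)]

T = 0.5                                   # assumed signal interval
u = [1.0, -3.0, 3.0, -1.0, 1.0]           # assumed 4-PAM signals

def g_sinc(t):
    return sinc(t / T)

def g_rect(t):
    return rect(t / T)

def g_gauss(t):
    return math.exp(-(t / T) ** 2)        # smooth, but NOT ideal Nyquist

for name, g in [("sinc", g_sinc), ("rect", g_rect), ("gauss", g_gauss)]:
    r = received_samples(u, g, T)
    print(name, is_ideal_nyquist(g, T), [round(v, 3) for v in r])
```

For the sinc and rect pulses the printed samples reproduce the input sequence; for the Gaussian pulse each sample is contaminated by its neighbors, which is exactly intersymbol interference.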


In summary, we want to find functions g(t) that are ideal Nyquist but are approximately base- 
band limited and approximately time limited. The Nyquist criterion, discussed in the next 
section, provides a useful frequency characterization of functions that are ideal Nyquist. This 
characterization will then be used to study ideal Nyquist functions that are approximately base- 
band limited and approximately time limited. 


6.3 The Nyquist criterion


The ideal Nyquist property is determined solely by the T-spaced samples of the waveform g(t).
This suggests that the results about aliasing should be relevant. Let s(t) be the baseband-limited
waveform generated by the samples of g(t), i.e.,

$$s(t) = \sum_k g(kT)\, \mathrm{sinc}\!\left(\frac{t}{T} - k\right). \tag{6.11}$$


⁸By looking at the frequency domain, it is not difficult to construct a g(t) of infinite energy from $\mathcal{L}_2$ functions
p(t) and q(t). When we study noise, however, we find that there is no point in constructing such a g(t), so we
ignore the possibility.
If g(t) is ideal Nyquist, then all the above terms except k = 0 disappear and s(t) = sinc(t/T).
Conversely, if s(t) = sinc(t/T), then g(t) must be ideal Nyquist. Taking the Fourier transform of
(6.11) shows that g(t) is ideal Nyquist if and only if

$$\hat{s}(f) = T\, \mathrm{rect}(fT). \tag{6.12}$$


From the aliasing theorem,

$$\hat{s}(f) = \mathrm{l.i.m.} \sum_m \hat{g}\!\left(f + \frac{m}{T}\right) \mathrm{rect}(fT). \tag{6.13}$$


The result of combining (6.12) and (6.13) is the Nyquist criterion: 


Theorem 6.3.1 (Nyquist criterion). Let $\hat{g}(f)$ be $\mathcal{L}_2$ and satisfy the condition
$\lim_{|f| \to \infty} \hat{g}(f)\,|f|^{1+\varepsilon} = 0$ for some $\varepsilon > 0$. Then the inverse transform, g(t), of $\hat{g}(f)$ is
ideal Nyquist with interval T if and only if $\hat{g}(f)$ satisfies the Nyquist criterion for T, defined as⁹

$$\mathrm{l.i.m.} \sum_m \hat{g}(f + m/T)\, \mathrm{rect}(fT) = T\, \mathrm{rect}(fT). \tag{6.14}$$


Proof: From the aliasing theorem, the baseband approximation s(t) in (6.11) converges pointwise
and is $\mathcal{L}_2$. Similarly, the Fourier transform $\hat{s}(f)$ satisfies (6.13). If g(t) is ideal Nyquist,
then s(t) = sinc(t/T). This implies that $\hat{s}(f)$ is $\mathcal{L}_2$ equivalent to $T\,\mathrm{rect}(fT)$, which in turn implies
(6.14). Conversely, satisfaction of the Nyquist criterion (6.14) implies that $\hat{s}(f) = T\,\mathrm{rect}(fT)$.
This implies s(t) = sinc(t/T), which in turn implies that g(t) is ideal Nyquist.


There are many choices for g(f) that satisfy (6.14), but the ones of major interest are those that 
are approximately both bandlimited and time limited. We look specifically at cases where g(f) is 
strictly bandlimited, which, as we have seen, means that g(t) is not strictly time limited. Before 
these filters can be used, of course, they must be truncated to be strictly time limited. It is 
strange to look for strictly bandlimited and approximately time-limited functions when it is the 
opposite that is required, but the reason is that the frequency constraint is the more important. 
The time constraint is usually more flexible and can be imposed as an approximation. 


6.3.1 Band-edge symmetry 


The nominal or Nyquist bandwidth associated with a PAM pulse g(t) with signal interval T is defined
to be $W_b = 1/(2T)$. The actual baseband bandwidth¹⁰ $B_b$ is defined as the smallest number $B_b$
such that $\hat{g}(f) = 0$ for $|f| > B_b$. Note that if $\hat{g}(f) = 0$ for $|f| > W_b$, then the left side of (6.14)
is zero except for m = 0, so $\hat{g}(f) = T\,\mathrm{rect}(fT)$. This means that $B_b \ge W_b$ and equality holds if
and only if g(t) = sinc(t/T).

As discussed above, if $W_b$ is much smaller than $B_b$, then $W_b$ can be increased, thus increasing
the rate $R_s$ at which signals can be transmitted. Thus g(t) should be chosen in such a way that


⁹It can be seen that $\sum_m \hat{g}(f + m/T)$ is periodic and thus the rect(fT) could be essentially omitted from both
sides of (6.14). Doing this, however, would make the limit in the mean meaningless and would also complicate
the intuitive understanding of the theorem.

¹⁰It might be better to call this the design bandwidth, since after the truncation necessary for finite delay,
the resulting frequency function is nonzero almost everywhere. However, if the delay is large enough, the energy
outside of $B_b$ is negligible. On the other hand, Exercise 6.9 shows that these approximations must be handled
with great care.
$B_b$ exceeds $W_b$ by a relatively small amount. In particular, we now focus on the case where
$W_b \le B_b < 2W_b$.


The assumption $B_b < 2W_b$ means that $\hat{g}(f) = 0$ for $|f| \ge 2W_b$. Thus for $0 \le f \le W_b$,
$\hat{g}(f + 2mW_b)$ can be nonzero only for m = 0 and m = −1. Thus the Nyquist criterion (6.14) in
this positive frequency interval becomes

$$\hat{g}(f) + \hat{g}(f - 2W_b) = T \quad \text{for } 0 \le f \le W_b. \tag{6.15}$$


Since p(t) and q(t) are real, g(t) is also real, so $\hat{g}(f - 2W_b) = \hat{g}^*(2W_b - f)$. Substituting this in
(6.15) and letting $\Delta = f - W_b$, (6.15) becomes

$$T - \hat{g}(W_b + \Delta) = \hat{g}^*(W_b - \Delta). \tag{6.16}$$


This is sketched and interpreted in Figure 6.3. The figure assumes the typical situation in which
$\hat{g}(f)$ is real. In the general case, the figure illustrates the real part of $\hat{g}(f)$ and the imaginary
part satisfies $\Im\{\hat{g}(W_b + \Delta)\} = \Im\{\hat{g}(W_b - \Delta)\}$.


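[Figure: sketch of a real $\hat{g}(f)$ versus f for $f \ge 0$, with the level T and the frequencies $W_b$ and $B_b$ marked, and with the segments $T - \hat{g}(W_b - \Delta)$ and $\hat{g}(W_b + \Delta)$ indicated.]

Figure 6.3: Band edge symmetry illustrated for real $\hat{g}(f)$: For each $\Delta$, $0 \le \Delta \le W_b$,
$\hat{g}(W_b + \Delta) = T - \hat{g}(W_b - \Delta)$. The portion of the curve for $f \ge W_b$, rotated by 180°
around the point $(W_b, T/2)$, is equal to the portion of the curve for $f \le W_b$.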


Figure 6.3 makes it particularly clear that $B_b$ must satisfy $B_b \ge W_b$ to avoid intersymbol
interference. We then see that the choice of $\hat{g}(f)$ involves a tradeoff between making $\hat{g}(f)$
smooth, so as to avoid a slow time decay in g(t), and reducing the excess of $B_b$ over the Nyquist
bandwidth $W_b$. This excess is expressed as a rolloff factor¹¹, defined to be $(B_b/W_b) - 1$, usually
expressed as a percentage. Thus $\hat{g}(f)$ in the figure has about a 30% rolloff.


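PAM filters in practice often have raised cosine transforms. The raised cosine frequency function,
for any given rolloff $\alpha$ between 0 and 1, is defined by

$$\hat{g}_\alpha(f) = \begin{cases} T, & 0 \le |f| \le \frac{1-\alpha}{2T}; \\ T\cos^2\!\left[\frac{\pi T}{2\alpha}\left(|f| - \frac{1-\alpha}{2T}\right)\right], & \frac{1-\alpha}{2T} \le |f| \le \frac{1+\alpha}{2T}; \\ 0, & |f| \ge \frac{1+\alpha}{2T}. \end{cases} \tag{6.17}$$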


¹¹The requirement for a small rolloff actually arises from a requirement on the transmitted pulse p(t), i.e., on
the actual bandwidth of the transmitted channel waveform, rather than on the cascade g(t) = p(t) * q(t). The
tacit assumption here is that $\hat{p}(f) = 0$ when $\hat{g}(f) = 0$. One reason for this is that it is silly to transmit energy in
a part of the spectrum that is going to be completely filtered out at the receiver. We see later that $\hat{p}(f)$ and $\hat{q}(f)$
are usually chosen to have the same magnitude, ensuring that $\hat{p}(f)$ and $\hat{g}(f)$ have the same rolloff.
The inverse transform of $\hat{g}_\alpha(f)$ can be shown to be (see Exercise 6.8)

$$g_\alpha(t) = \mathrm{sinc}\!\left(\frac{t}{T}\right) \frac{\cos(\pi\alpha t/T)}{1 - 4\alpha^2 t^2/T^2}, \tag{6.18}$$


which decays asymptotically as $1/t^3$, compared to 1/t for sinc(t/T). In particular, for a rolloff
$\alpha = 1$, $\hat{g}_\alpha(f)$ is nonzero from $-2W_b = -1/T$ to $2W_b = 1/T$ and $g_\alpha(t)$ has most of its energy
between −T and T. Rolloffs as sharp as 5-10% are used in current practice. The resulting $g_\alpha(t)$
goes to 0 with increasing |t| much faster than sinc(t/T), but the ratio of $g_\alpha(t)$ to sinc(t/T) is a
function of $\alpha t/T$ and reaches its first zero at $t = 1.5T/\alpha$. In other words, the required filtering
delay is proportional to $1/\alpha$.
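The time-domain claims about (6.18) can be checked with a short numerical sketch. The values T = 1 and a 30% rolloff are assumptions chosen for illustration, and the function names are hypothetical. The removable singularity of the second factor at $2\alpha|t|/T = 1$ is replaced by its limit, $\pi/4$.

```python
import math

def sinc(x):
    """sinc(x) = sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def rolloff_term(x, alpha):
    """cos(pi a x) / (1 - 4 a^2 x^2): the ratio of g_alpha(t) to sinc(t/T) at x = t/T."""
    denom = 1.0 - (2.0 * alpha * x) ** 2
    if abs(denom) < 1e-12:
        return math.pi / 4.0              # limit at the removable singularity
    return math.cos(math.pi * alpha * x) / denom

def raised_cosine_pulse(t, T, alpha):
    """g_alpha(t) as given in (6.18)."""
    return sinc(t / T) * rolloff_term(t / T, alpha)

T, alpha = 1.0, 0.3                       # assumed interval and rolloff

# Ideal Nyquist samples: 1 at t = 0 and 0 at the other multiples of T.
samples = [raised_cosine_pulse(k * T, T, alpha) for k in range(-8, 9)]

# Peak magnitude over 10T <= t <= 11T, for sinc and for the raised cosine pulse.
tail = [10.0 * T + 0.01 * i * T for i in range(101)]
sinc_tail = max(abs(sinc(t / T)) for t in tail)
rc_tail = max(abs(raised_cosine_pulse(t, T, alpha)) for t in tail)

# The ratio g_alpha(t)/sinc(t/T) first vanishes at t = 1.5 T / alpha.
first_zero = rolloff_term(1.5 / alpha, alpha)

print(sinc_tail, rc_tail, first_zero)
```

With these assumed values the raised cosine tail near t = 10T is more than an order of magnitude below the sinc tail, which is the practical content of the $1/t^3$ versus $1/t$ comparison.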


The motivation for the raised cosine shape is that $\hat{g}(f)$ should be smooth in order for g(t) to
decay quickly in time, but $\hat{g}(f)$ must decrease from T at $W_b(1 - \alpha)$ to 0 at $W_b(1 + \alpha)$; as seen
in Figure 6.3, the raised cosine function simply rounds off the step discontinuity in $\mathrm{rect}(f/(2W_b))$ in
such a way as to maintain the Nyquist criterion while making $\hat{g}(f)$ continuous with a continuous
derivative, thus guaranteeing that g(t) decays asymptotically with $1/t^3$.
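The frequency-domain side can be checked the same way. This sketch, with assumed values T = 0.5 and a 30% rolloff and with hypothetical function names, evaluates the raised cosine function of (6.17), sums its 1/T-spaced shifts as on the left side of (6.14), and tests the band-edge symmetry relation (6.16) for real $\hat{g}(f)$.

```python
import math

def raised_cosine_spectrum(f, T, alpha):
    """hat g_alpha(f) as in (6.17), for 0 < alpha <= 1."""
    af = abs(f)
    lo = (1.0 - alpha) / (2.0 * T)
    hi = (1.0 + alpha) / (2.0 * T)
    if af <= lo:
        return T
    if af >= hi:
        return 0.0
    return T * math.cos(math.pi * T / (2.0 * alpha) * (af - lo)) ** 2

def aliased_sum(f, T, alpha, M=3):
    """Sum over |m| <= M of hat g(f + m/T); enough terms for |f| <= 1/(2T)."""
    return sum(raised_cosine_spectrum(f + m / T, T, alpha) for m in range(-M, M + 1))

T, alpha = 0.5, 0.3                       # assumed interval and rolloff
Wb = 1.0 / (2.0 * T)                      # nominal (Nyquist) bandwidth

# Nyquist criterion: the aliased sum equals T throughout [-Wb, Wb].
freqs = [-Wb + 2.0 * Wb * i / 200 for i in range(201)]
worst_nyquist = max(abs(aliased_sum(f, T, alpha) - T) for f in freqs)

# Band-edge symmetry: T - hat g(Wb + D) = hat g(Wb - D) for 0 <= D <= Wb.
deltas = [Wb * i / 100 for i in range(101)]
worst_symmetry = max(
    abs(T - raised_cosine_spectrum(Wb + d, T, alpha)
        - raised_cosine_spectrum(Wb - d, T, alpha))
    for d in deltas
)

print(worst_nyquist, worst_symmetry)
```

Both deviations come out at the level of floating-point rounding, since the $\cos^2$ contribution of one shift and the $\sin^2$ contribution of the neighboring shift add to T in the transition band.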


6.3.2 Choosing {p(t—kT);k € Z} as an orthonormal set 


The above subsection describes the choice of $\hat{g}(f)$ as a compromise between rolloff and smoothness,
subject to band edge symmetry. As illustrated in Figure 6.3, it is not a serious additional
constraint to restrict $\hat{g}(f)$ to be real and nonnegative (why let $\hat{g}(f)$ go negative or imaginary
in making a smooth transition from T to 0?). After choosing $\hat{g}(f) \ge 0$, however, the question
remains of choosing the transmit filter p(t) and the receive filter q(t) subject to $\hat{p}(f)\hat{q}(f) = \hat{g}(f)$.
When studying white Gaussian noise later, we will find that $\hat{q}(f)$ should be chosen to equal
$\hat{p}^*(f)$. Thus¹²,

$$|\hat{p}(f)| = |\hat{q}(f)| = \sqrt{\hat{g}(f)}. \tag{6.19}$$


The phase of $\hat{p}(f)$ can be chosen in an arbitrary way, but this determines the phase of $\hat{q}(f) =
\hat{p}^*(f)$. The requirement that $\hat{p}(f)\hat{q}(f) = \hat{g}(f) \ge 0$ means that $\hat{q}(f) = \hat{p}^*(f)$. In addition, if p(t)
is real then $\hat{p}(-f) = \hat{p}^*(f)$, which determines the phase for negative f in terms of an arbitrary
phase for f > 0. It is convenient here, however, to be slightly more general and allow p(t) to be
complex. We will prove the following important theorem:


Theorem 6.3.2 (Orthonormal shifts). Let p(t) be an $\mathcal{L}_2$ function such that $\hat{g}(f) = |\hat{p}(f)|^2$
satisfies the Nyquist criterion for T. Then $\{p(t - kT); k \in \mathbb{Z}\}$ is a set of orthonormal functions.
Conversely, if $\{p(t - kT); k \in \mathbb{Z}\}$ is a set of orthonormal functions, then $|\hat{p}(f)|^2$ satisfies the
Nyquist criterion.


Proof: Let $q(t) = p^*(-t)$. Then g(t) = p(t) * q(t) so that

$$g(kT) = \int_{-\infty}^{\infty} p(\tau)\, q(kT - \tau)\, d\tau = \int_{-\infty}^{\infty} p(\tau)\, p^*(\tau - kT)\, d\tau. \tag{6.20}$$


If $\hat{g}(f)$ satisfies the Nyquist criterion, then g(t) is ideal Nyquist and (6.20) has the value 0 for
each integer $k \ne 0$ and has the value 1 for k = 0. By shifting the variable of integration by


¹²A function p(t) satisfying (6.19) is often called square root of Nyquist, although it is the magnitude of the
transform that is the square root of the transform of an ideal Nyquist pulse.
jT for any integer j in (6.20), we see also that $\int p(\tau - jT)\, p^*(\tau - (k+j)T)\, d\tau = 0$ for $k \ne 0$
and 1 for k = 0. Thus $\{p(t - kT); k \in \mathbb{Z}\}$ is an orthonormal set. Conversely, assume that
$\{p(t - kT); k \in \mathbb{Z}\}$ is an orthonormal set. Then (6.20) has the value 0 for integer $k \ne 0$ and 1
for k = 0. Thus g(t) is ideal Nyquist and $\hat{g}(f)$ satisfies the Nyquist criterion.


Given this orthonormal shift property for p(t), the PAM transmitted waveform $u(t) =
\sum_k u_k\, p(t - kT)$ is simply an orthonormal expansion. Retrieving the coefficient $u_k$ then corresponds
to projecting u(t) onto the one dimensional subspace spanned by p(t − kT). Note that this
projection is accomplished by filtering u(t) by q(t) and then sampling at time kT. The filter
q(t) is called the matched filter to p(t). We discuss these filters later when noise is introduced
into the picture.
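The projection view can be illustrated with a small numerical sketch. It assumes a unit-energy rectangular pulse on [0, T), which is a hypothetical stand-in chosen only because its T-spaced shifts are obviously orthonormal; it is far from bandlimited and is not a pulse suggested by the text. Integrals are approximated by a midpoint rule, and all names are illustrative.

```python
import math

T = 1.0                       # assumed signal interval
N = 200                       # integration steps per signal interval
dt = T / N

def p(t):
    """Unit-energy rectangular pulse on [0, T); a stand-in for a square-root-of-Nyquist pulse."""
    return 1.0 / math.sqrt(T) if 0.0 <= t < T else 0.0

def inner(f, g, t0, t1):
    """Midpoint-rule approximation of the integral of f(t) g(t) over [t0, t1], for real f and g."""
    steps = int(round((t1 - t0) / dt))
    return sum(f(t0 + (i + 0.5) * dt) * g(t0 + (i + 0.5) * dt) for i in range(steps)) * dt

u_seq = [3.0, -1.0, 1.0, -3.0]          # assumed 4-PAM signals u_0, ..., u_3
K = len(u_seq)

def u(t):
    """PAM waveform u(t) = sum_k u_k p(t - kT)."""
    return sum(uk * p(t - k * T) for k, uk in enumerate(u_seq))

t0, t1 = 0.0, K * T

# Gram matrix of the shifted pulses: the identity if the shifts are orthonormal.
gram = [[inner(lambda t: p(t - j * T), lambda t: p(t - k * T), t0, t1)
         for k in range(K)] for j in range(K)]

# Projection onto p(t - kT), i.e. matched filtering followed by sampling at kT.
recovered = [inner(u, lambda t: p(t - k * T), t0, t1) for k in range(K)]

print(gram)
print(recovered)
```

Because p(t) is real here, the conjugate in (6.20) is omitted. The recovered coefficients equal the transmitted signals, as the orthonormal expansion predicts.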


Note that we have restricted the pulse p(t) to have unit energy. There is no loss of generality
here, since the input signals $\{u_k\}$ can be scaled arbitrarily and there is no point in having an
arbitrary scale factor in both places.


For $|\hat{p}(f)|^2 = \hat{g}(f)$, the actual bandwidths of $\hat{p}(f)$, $\hat{q}(f)$, and $\hat{g}(f)$ are the same, say $B_b$. Thus if
$B_b < \infty$, we see that p(t) and q(t) can be realized only with infinite delay, which means that
both must be truncated. Since $q(t) = p^*(-t)$, they must be truncated for both positive and
negative t. We assume that they are truncated at such a large value of delay that the truncation
error is negligible. Note that the delay generated by both the transmitter and receiver filter
(i.e., from the time that $u_k\, p(t - kT)$ starts to be formed at the transmitter to the time when
$u_k$ is sampled at the receiver) is twice the duration of p(t).


6.3.3 Relation between PAM and analog source coding 


The main emphasis in PAM modulation has been that of converting a sequence of T-spaced 
signals into a waveform. Similarly, the first part of analog source coding is often to convert 
a waveform into a T-spaced sequence of samples. The major difference is that with PAM 
modulation, we have control over the PAM pulse p(t) and thus some control over the class of 
waveforms. With source coding, we are stuck with whatever class of waveforms describes the 
source of interest. 


For both systems the nominal bandwidth is $W_b = 1/(2T)$ and $B_b$ can be defined as the actual
baseband bandwidth of the waveforms. In the case of source coding, $B_b \le W_b$ is a necessary
condition for the sampling approximation $\sum_k u(kT)\, \mathrm{sinc}(t/T - k)$ to perfectly recreate the waveform
u(t). The aliasing theorem and the T-spaced sinc weighted sinusoid expansion were used to
analyze the squared error if $B_b > W_b$.


For PAM, on the other hand, the necessary condition for the PAM demodulator to recreate the
initial PAM sequence is $B_b \ge W_b$. With $B_b > W_b$, aliasing can be used to advantage, creating
an aggregate pulse g(t) that is ideal Nyquist. There is considerable choice in such a pulse, and
it is chosen by using contributions from both $f < W_b$ and $f > W_b$. Finally we saw that the
transmission pulse p(t) for PAM can be chosen so that its T-spaced shifts form an orthonormal
set. The sinc functions have this property, but many other waveforms with slightly greater
bandwidth have the same property but decay much faster with t.
6.4 Modulation: baseband to passband and back


The discussion of PAM in the previous two sections focused on converting a T-spaced sequence
of real signals into a real waveform of bandwidth $B_b$ slightly larger than the Nyquist bandwidth
$W_b = 1/(2T)$. This section focuses on converting that baseband waveform into a passband waveform
appropriate for the physical medium, regulatory constraints, and avoiding other transmission
bands.


6.4.1 Double-sideband amplitude modulation 


The objective of modulating a baseband PAM waveform u(t) to some high frequency passband
around some carrier $f_c$ is simply to shift u(t) up in frequency to $u(t)e^{2\pi i f_c t}$. Thus if $\hat{u}(f)$ is zero
except for $-B_b \le f \le B_b$, then the shifted version would be zero except for $f_c - B_b \le f \le f_c + B_b$.
This does not quite work since it results in a complex waveform, whereas only real waveforms
can actually be transmitted. Thus u(t) is also multiplied by the complex conjugate of $e^{2\pi i f_c t}$,
i.e., $e^{-2\pi i f_c t}$, resulting in the following passband waveform:

$$x(t) = u(t)\left[e^{2\pi i f_c t} + e^{-2\pi i f_c t}\right] = 2u(t)\cos(2\pi f_c t), \tag{6.21}$$

$$\hat{x}(f) = \hat{u}(f - f_c) + \hat{u}(f + f_c). \tag{6.22}$$


As illustrated in Figure 6.4, u(t) is both translated up in frequency by $f_c$ and also translated down
by $f_c$. Since x(t) must be real, $\hat{x}(f) = \hat{x}^*(-f)$, and the negative frequencies cannot be avoided.
Note that the entire set of frequencies in $[-B_b, B_b]$ is both translated up to $[-B_b + f_c, B_b + f_c]$
and down to $[-B_b - f_c, B_b - f_c]$. Thus (assuming $f_c > B_b$) the range of nonzero frequencies
occupied by x(t) is twice as large as that occupied by u(t).


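[Figure: two frequency-domain sketches. Top: $\hat{u}(f)$, nonzero for $|f| \le B_b$. Bottom: $\hat{x}(f)$, with copies of $\hat{u}$ centered at $-f_c$ and $f_c$; the positive band extends from $f_c - B_b$ to $f_c + B_b$.]

Figure 6.4: Frequency domain representation of a baseband waveform u(t) shifted up to
a passband around the carrier $f_c$. Note that the baseband bandwidth $B_b$ of u(t) has been
doubled to the passband bandwidth $B = 2B_b$ of x(t).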


In the communication field, the bandwidth of a system is universally defined as the range of
positive frequencies used in transmission. Since transmitted waveforms are real, the negative
frequency part of those waveforms is determined by the positive part and is not counted. This is
consistent with our earlier baseband usage, where $B_b$ is the bandwidth of the baseband waveform
u(t) in Figure 6.4, and with our new usage for passband waveforms where $B = 2B_b$ is the
bandwidth of $\hat{x}(f)$.


The passband modulation scheme described by (6.21) is called double-sideband amplitude modulation.
The terminology comes not from the negative frequency band around $-f_c$ and the
positive band around $f_c$, but rather from viewing $[f_c - B_b, f_c + B_b]$ as two sidebands, the upper,
$[f_c, f_c + B_b]$, coming from the positive frequency components of u(t) and the lower, $[f_c - B_b, f_c]$,
from its negative components. Since u(t) is real, these two bands are redundant and either could
be reconstructed from the other.


Double-sideband modulation is quite wasteful of bandwidth since half of the band is redundant. 
Redundancy is often useful for added protection against noise, but such redundancy is usually 
better achieved through digital coding. 


The simplest and most widely employed solution for using this wasted bandwidth¹³ is quadrature
amplitude modulation (QAM), which is described in the next section. PAM at passband
is appropriately viewed as a special case of QAM, and thus the demodulation of PAM from
passband to baseband is discussed at the same time as the demodulation of QAM.


6.5 Quadrature amplitude modulation (QAM) 


QAM is very similar to PAM except that with QAM the baseband waveform u(t) is chosen to
be complex. The complex QAM waveform u(t) is then shifted up to passband as $u(t)e^{2\pi i f_c t}$.
This waveform is complex and is converted into a real waveform for transmission by adding its
complex conjugate. The resulting real passband waveform is then

$$x(t) = u(t)e^{2\pi i f_c t} + u^*(t)e^{-2\pi i f_c t}. \tag{6.23}$$


Note that the passband waveform for PAM in (6.21) is a special case of this in which u(t) is real.
The passband waveform x(t) in (6.23) can also be written in the following equivalent ways:

$$x(t) = 2\Re\{u(t)e^{2\pi i f_c t}\} \tag{6.24}$$

$$x(t) = 2\Re\{u(t)\}\cos(2\pi f_c t) - 2\Im\{u(t)\}\sin(2\pi f_c t). \tag{6.25}$$


The factor of 2 in (6.24) and (6.25) is an arbitrary scale factor. Some authors leave it out (thus
requiring a factor of 1/2 in (6.23)) and others replace it by $\sqrt{2}$ (requiring a factor of $1/\sqrt{2}$ in
(6.23)). This scale factor (however chosen) causes additional confusion when we look at the
energy in the waveforms. With the scaling here, $\|x\|^2 = 2\|u\|^2$. Using the scale factor $\sqrt{2}$
solves this problem, but introduces many other problems, not least of which is an extraordinary
number of $\sqrt{2}$'s in equations. At one level, scaling is a trivial matter, but although the literature
is inconsistent, we have tried to be consistent here. One intuitive advantage of the convention
here, as illustrated in Figure 6.4, is that the positive frequency part of x(t) is simply u(t) shifted
up by $f_c$.
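The equivalence of (6.23), (6.24), and (6.25), and the energy relation under this scaling, can be checked numerically. The sketch below assumes a hypothetical complex baseband waveform with a Gaussian envelope and a hypothetical carrier $f_c = 5$; the Gaussian is not strictly bandlimited, but its spectrum is negligible near the carrier, so the energy relation should hold to high accuracy.

```python
import cmath
import math

fc = 5.0                                   # assumed carrier frequency

def u(t):
    """Assumed complex baseband waveform: a Gaussian envelope with a complex amplitude."""
    return (1.0 + 0.5j) * math.exp(-t * t)

def x_623(t):
    """(6.23): u(t) e^{2 pi i fc t} plus its complex conjugate."""
    c = cmath.exp(2j * math.pi * fc * t)
    return u(t) * c + u(t).conjugate() * c.conjugate()

def x_624(t):
    """(6.24): twice the real part of u(t) e^{2 pi i fc t}."""
    return 2.0 * (u(t) * cmath.exp(2j * math.pi * fc * t)).real

def x_625(t):
    """(6.25): in-phase and quadrature form."""
    return (2.0 * u(t).real * math.cos(2.0 * math.pi * fc * t)
            - 2.0 * u(t).imag * math.sin(2.0 * math.pi * fc * t))

dt = 1e-3
ts = [-6.0 + i * dt for i in range(12001)]

# Largest disagreement between the three expressions on the grid.
max_gap = max(max(abs(x_623(t) - x_624(t)), abs(x_624(t) - x_625(t))) for t in ts)

# Riemann-sum approximations of the energies ||u||^2 and ||x||^2.
energy_u = sum(abs(u(t)) ** 2 for t in ts) * dt
energy_x = sum(x_624(t) ** 2 for t in ts) * dt

print(max_gap, energy_u, energy_x)
```

The three forms agree to rounding error, and the passband energy comes out as twice the baseband energy, consistent with $\|x\|^2 = 2\|u\|^2$.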

The remainder of this section provides a more detailed explanation of QAM, and thus also
of a number of issues about PAM. A QAM modulator (see Figure 6.5) has the same three layers
as a PAM modulator, i.e., first mapping a sequence of bits to a sequence of complex signals,
then mapping the complex sequence to a complex baseband waveform, and finally mapping the
complex baseband waveform to a real passband waveform.


The demodulator, not surprisingly, performs the inverse of these operations in reverse order, 
first mapping the received bandpass waveform into a baseband waveform, then recovering the 


¹³An alternate approach is single-sideband modulation. Here either the positive or negative sideband of a
double-sideband waveform is filtered out, thus reducing the transmitted bandwidth by a factor of 2. This used to
be quite popular for analog communication but is harder to implement for digital communication than QAM.
sequence of signals, and finally recovering the binary digits. Each of these layers is discussed in
turn.


[Figure: block diagram. Transmitter: binary input → signal encoder → baseband modulator → baseband to passband → channel. Receiver: channel → passband to baseband → baseband demodulator → signal decoder → binary output.]

Figure 6.5: QAM modulator and demodulator.


6.5.1 QAM signal set 


The input bit sequence arrives at a rate of R b/s and is converted, b bits at a time, into a
sequence of complex signals $u_k$ chosen from a signal set (alphabet, constellation) $\mathcal{A}$ of size
$M = |\mathcal{A}| = 2^b$. The signal rate is thus $R_s = R/b$ signals per second, and the signal interval is
$T = 1/R_s = b/R$ sec.


In the case of QAM, the transmitted signals $u_k$ are complex numbers $u_k \in \mathbb{C}$, rather than real
numbers. Alternatively, we may think of each signal as a real 2-tuple in $\mathbb{R}^2$.


A standard (M′ × M′)-QAM signal set, where $M = (M')^2$, is the Cartesian product of two
M′-PAM sets; i.e.,

$$\mathcal{A} = \{(a' + i a'') \mid a' \in \mathcal{A}',\ a'' \in \mathcal{A}'\},$$

where

$$\mathcal{A}' = \{-d(M' - 1)/2, \ldots, -d/2,\ d/2, \ldots, d(M' - 1)/2\}.$$

The signal set $\mathcal{A}$ thus consists of a square array of $M = (M')^2 = 2^b$ signal points located
symmetrically about the origin, as illustrated below for M = 16.


The minimum distance between the two-dimensional points is denoted by d. Also the average
energy per two-dimensional signal, which is denoted $E_s$, is simply twice the average energy per
dimension:

$$E_s = \frac{d^2[(M')^2 - 1]}{6} = \frac{d^2[M - 1]}{6}.$$

In the case of QAM there are clearly many ways to arrange the signal points other than on a
square grid as above. For example, in an M-PSK (phase-shift keyed) signal set, the signal points
consist of M equally spaced points on a circle centered on the origin. Thus 4-PSK = 4-QAM.
For large M it can be seen that the signal points become very close to each other on a circle so
that PSK is rarely used for large M. On the other hand, PSK has some practical advantages
because of the uniform signal magnitudes.
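The energy formula and the comparison with PSK can be checked directly. This is a minimal sketch with hypothetical helper names; the PSK radius $d / (2\sin(\pi/M))$ is derived here from the chord length between adjacent points and is not stated in the text.

```python
import cmath
import math

def pam_set(m, d):
    """M'-PAM alphabet {-d(m-1)/2, ..., -d/2, d/2, ..., d(m-1)/2}."""
    return [d * (2 * j - (m - 1)) / 2.0 for j in range(m)]

def qam_set(m, d):
    """Standard (m x m)-QAM signal set as a Cartesian product of two m-PAM sets."""
    return [complex(a, b) for a in pam_set(m, d) for b in pam_set(m, d)]

def psk_set(M, d):
    """M-PSK signal set whose adjacent points are at distance d (radius from the chord length)."""
    r = d / (2.0 * math.sin(math.pi / M))
    return [r * cmath.exp(2j * math.pi * k / M) for k in range(M)]

def average_energy(points):
    """Average of |a|^2 over equiprobable signal points."""
    return sum(z.real ** 2 + z.imag ** 2 for z in points) / len(points)

def min_distance(points):
    """Smallest distance between two distinct signal points."""
    return min(abs(z - w) for i, z in enumerate(points) for w in points[i + 1:])

d = 2.0
qam16 = qam_set(4, d)
psk16 = psk_set(16, d)

print(average_energy(qam16), d * d * (16 - 1) / 6.0)    # both forms of E_s for 16-QAM
print(min_distance(qam16), min_distance(psk16))          # same minimum distance d
print(average_energy(psk16))                             # energy needed by 16-PSK
print(average_energy(qam_set(2, d)), average_energy(psk_set(4, d)))   # 4-QAM vs 4-PSK
```

For the same minimum distance, 16-PSK needs well over twice the average energy of 16-QAM, while 4-PSK and 4-QAM have the same energy, in line with the remarks above.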


As with PAM, the probability of decoding error is primarily a function of the minimum distance
d. Not surprisingly, $E_s$ is linear in the signal power of the passband waveform. In wireless
systems the signal power is limited both to conserve battery power and to meet regulatory
requirements. In wired systems, the power is limited both to avoid crosstalk between adjacent
wires and adjacent frequencies, and also to avoid nonlinear effects.


For all of these reasons, it is desirable to choose signal constellations that approximately minimize
$E_s$ for a given d and M. One simple result here is that a hexagonal grid of signal points achieves
smaller $E_s$ than a square grid for very large M and fixed minimum distance. Unfortunately,
finding the optimal signal set to minimize $E_s$ for practical values of M is a messy and ugly
problem, and the minima have few interesting properties or symmetries. We will not spend
further time on this other than a few exercises. The standard (M′ × M′)-QAM signal set is almost
universally used in practice and will be assumed in what follows.


6.5.2 QAM baseband modulation and demodulation


A QAM baseband modulator is determined by the signal interval T and a complex $\mathcal{L}_2$ waveform
p(t). The discrete-time sequence $\{u_k\}$ of complex signal points modulates the amplitudes of a
sequence of time shifts {p(t − kT)} of the basic pulse p(t) to create a complex transmitted signal
u(t) as follows:

$$u(t) = \sum_{k \in \mathbb{Z}} u_k\, p(t - kT). \tag{6.26}$$


As in the PAM case, we could choose p(t) to be sinc(t/T), but for the same reasons as before, p(t)
should decay with increasing |t| faster than the sinc function. This means that $\hat{p}(f)$ should be
a continuous function that goes to zero rapidly but not instantaneously as f increases beyond
1/(2T). As with PAM, we define $W_b = 1/(2T)$ to be the nominal baseband bandwidth of the QAM
modulator and $B_b$ to be the actual design bandwidth.


Assume for the moment that the process of conversion to passband, channel transmission, and
conversion back to baseband, is ideal, recreating the baseband modulator output u(t) at the
input to the baseband demodulator. The baseband demodulator is determined by the interval
T (the same as at the modulator) and an $\mathcal{L}_2$ waveform q(t). The demodulator filters u(t) by
q(t) and samples the output at T-spaced sample times. Denoting the filtered output by

$$r(t) = \int_{-\infty}^{\infty} u(\tau)\, q(t - \tau)\, d\tau,$$

we see that the received samples are r(T), r(2T), .... Note that this is the same as the PAM
demodulator except that real signals have been replaced by complex signals. As before, the
output r(t) can be represented as

$$r(t) = \sum_k u_k\, g(t - kT),$$

where g(t) is the convolution of p(t) and q(t). As before, $r(kT) = u_k$ if g(t) is ideal Nyquist,
namely if g(0) = 1 and g(kT) = 0 for all nonzero integer k.


The proof of the Nyquist criterion, Theorem 6.3.1, is valid whether or not g(t) is real. For the 
reasons explained earlier, however, g(f) is usually real and symmetric (as with the raised cosine 
functions) and this implies that g(t) is also real and symmetric. 


Finally, as discussed with PAM, $\hat{p}(f)$ is usually chosen to satisfy $|\hat{p}(f)| = \sqrt{\hat{g}(f)}$. Choosing
$\hat{p}(f)$ in this way does not specify the phase of $\hat{p}(f)$, and thus $\hat{p}(f)$ might be real or complex.
However $\hat{p}(f)$ is chosen, subject to $|\hat{p}(f)|^2$ satisfying the Nyquist criterion, the set of time shifts
{p(t − kT)} form an orthonormal set of functions. With this choice also, the baseband bandwidths
of u(t), p(t), and g(t) are all the same. Each has a nominal baseband bandwidth given by $1/(2T)$
and each has an actual baseband bandwidth that exceeds $1/(2T)$ by some small rolloff factor. As
with PAM, p(t) and q(t) must be truncated in time to allow finite delay. The resulting filters
are then not quite bandlimited, but this is viewed as a negligible implementation error.


In summary, QAM baseband modulation is virtually the same as PAM baseband modulation. 
The signal set for QAM is of course complex, and the modulating pulse p(t) can be complex, 
but the Nyquist results about avoiding intersymbol interference are unchanged. 


6.5.3 QAM: baseband to passband and back 


Next we discuss modulating the complex QAM baseband waveform u(t) to the passband waveform
x(t). Alternative expressions for x(t) are given by (6.23), (6.24), and (6.25) and the
frequency representation is illustrated in Figure 6.4.


As with PAM, u(t) has a nominal baseband bandwidth $W_b = 1/(2T)$. The actual baseband bandwidth
$B_b$ exceeds $W_b$ by some small rolloff factor. The corresponding passband waveform x(t)
has a nominal passband bandwidth $W = 2W_b = 1/T$ and an actual passband bandwidth $B = 2B_b$.
We will assume in everything to follow that $B/2 < f_c$. Recall that u(t) and x(t) are idealized
approximations of the true baseband and transmitted waveforms. These true baseband and
transmitted waveforms must have finite delay and thus infinite bandwidth, but it is assumed
that the delay is large enough that the approximation error is negligible. The assumption¹⁴
$B/2 < f_c$ implies that $u(t)e^{2\pi i f_c t}$ is constrained to positive frequencies and $u^*(t)e^{-2\pi i f_c t}$ to negative
frequencies. Thus the Fourier transform $\hat{u}(f - f_c)$ does not overlap with $\hat{u}^*(-f - f_c)$.


As with PAM, the modulation from baseband to passband is viewed as a two step process.
First u(t) is translated up in frequency by an amount $f_c$, resulting in a complex passband
waveform $x^+(t) = u(t)e^{2\pi i f_c t}$. Next $x^+(t)$ is converted to the real passband waveform
$x(t) = [x^+(t)]^* + x^+(t)$.


¹⁴Exercise 6.11 shows that when this assumption is violated, u(t) cannot be perfectly retrieved from x(t), even
in the absence of noise. The negligible frequency components of the truncated version of u(t) outside of B/2 are
assumed to cause negligible error in demodulation.
Assume for now that x(t) is transmitted to the receiver with no noise and no delay. In principle,
the received x(t) can be modulated back down to baseband by the reverse of the two steps used
in going from baseband to passband. That is, x(t) must first be converted back to the complex
positive passband waveform $x^+(t)$, and then $x^+(t)$ must be shifted down in frequency by $f_c$.


Mathematically, $x^+(t)$ can be retrieved from x(t) simply by filtering x(t) by a complex filter
h(t) such that $\hat{h}(f) = 0$ for $f < 0$ and $\hat{h}(f) = 1$ for $f > 0$. This filter is called a Hilbert filter.
Note that h(t) is not an $\mathcal{L}_2$ function, but it can be converted to $\mathcal{L}_2$ by making $\hat{h}(f)$ have the
value 0 except in the positive passband $[-\frac{B}{2} + f_c, \frac{B}{2} + f_c]$, where it has the value 1. We can then
easily retrieve u(t) from $x^+(t)$ simply by a frequency shift. Figure 6.6 illustrates the sequence
of operations from u(t) to x(t) and back again.


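[Figure 6.6 diagram. Transmitter: u(t) is multiplied by $e^{2\pi i f_c t}$ to give $x^+(t)$, which passes
through a $2\Re\{\cdot\}$ block to give x(t). Receiver: x(t) passes through the Hilbert filter to give
$x^+(t)$, which is multiplied by $e^{-2\pi i f_c t}$ to give u(t).]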


Figure 6.6: Baseband to passband and back. 
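

The chain in Figure 6.6 can be tried out numerically. The following is a minimal sketch rather than a description of any real system: it assumes a periodic discrete-time setting in which frequencies are DFT bins, so that the Hilbert filter reduces to zeroing the negative-frequency bins. The names N, KC (standing in for the carrier frequency), COEFFS, dft, and carrier are all hypothetical choices made for this illustration.

```python
import cmath
import math

N = 64          # samples in one period (hypothetical discrete-time setup)
KC = 10         # carrier frequency in DFT bins, playing the role of f_c
# Baseband waveform: a few low-frequency complex exponentials, |bin| <= 3 < KC
COEFFS = {-3: 0.5 - 1.0j, -1: 1.0 + 0.25j, 0: 0.3j, 2: -0.7 + 0.4j}

def dft(seq, sign):
    """Unnormalized DFT sum with kernel exp(sign * 2 pi i k t / n)."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def carrier(sign):
    """Samples of exp(sign * 2 pi i KC t / N)."""
    return [cmath.exp(sign * 2j * math.pi * KC * t / N) for t in range(N)]

u = [sum(c * cmath.exp(2j * math.pi * k * t / N) for k, c in COEFFS.items())
     for t in range(N)]

# Transmitter: shift up in frequency, then take twice the real part.
x_plus = [ut * ct for ut, ct in zip(u, carrier(+1))]
x = [2 * xp.real for xp in x_plus]

# Receiver: the Hilbert filter keeps the positive-frequency bins only,
# and the result is then shifted back down to baseband.
X = dft(x, -1)
X_pos = [X[k] if 0 < k < N // 2 else 0 for k in range(N)]
x_plus_rx = [v / N for v in dft(X_pos, +1)]
u_rx = [xp * ct for xp, ct in zip(x_plus_rx, carrier(-1))]

max_err = max(abs(a - b) for a, b in zip(u, u_rx))
print(f"max |u - u_rx| = {max_err:.2e}")
```

Because the baseband bins all lie below KC, the positive and negative frequency parts of x do not overlap, and the recovered waveform agrees with u up to rounding error.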


6.5.4 Implementation of QAM 


From an implementation standpoint, the baseband waveform u(t) is usually implemented as
two real waveforms, $\Re\{u(t)\}$ and $\Im\{u(t)\}$. These are then modulated up to passband using
multiplication by in-phase and out-of-phase carriers as in (6.25), i.e.,


$$x(t) = 2\Re\{u(t)\}\cos(2\pi f_c t) - 2\Im\{u(t)\}\sin(2\pi f_c t).$$


There are many other possible implementations, however, such as starting with u(t) given as
magnitude and phase. The positive frequency expression $x^+(t) = u(t)e^{2\pi i f_c t}$ is a complex
multiplication of complex waveforms which requires 4 real multiplications rather than the two above
used to form x(t) directly. Thus going from u(t) to $x^+(t)$ to x(t) provides insight but not ease
of implementation.


The baseband waveforms $\Re\{u(t)\}$ and $\Im\{u(t)\}$ are easier to generate and visualize if the
modulating pulse p(t) is also real. From the discussion of the Nyquist criterion, this is not a fundamental
limitation, and there are few reasons for desiring a complex p(t). For real p(t),


$$\Re\{u(t)\} = \sum_k \Re\{u_k\}\, p(t - kT),$$


$$\Im\{u(t)\} = \sum_k \Im\{u_k\}\, p(t - kT).$$


Letting $u'_k = \Re\{u_k\}$ and $u''_k = \Im\{u_k\}$, the transmitted passband waveform becomes


$$x(t) = 2\cos(2\pi f_c t)\Big(\sum_k u'_k\, p(t - kT)\Big) - 2\sin(2\pi f_c t)\Big(\sum_k u''_k\, p(t - kT)\Big). \qquad (6.27)$$


If the QAM signal set is a standard QAM set, then $\sum_k u'_k p(t - kT)$ and $\sum_k u''_k p(t - kT)$ are
parallel baseband PAM systems. They are modulated to passband using “double-sideband”
modulation by “quadrature carriers” $\cos 2\pi f_c t$ and $-\sin 2\pi f_c t$. These are then summed (with
the usual factor of 2), as shown in Figure 6.7. This realization of QAM is called double-sideband
quadrature-carrier (DSB-QC) modulation$^{15}$.


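[Figure 6.7 diagram. Upper branch: the sequence $\{u'_k\}$ forms the impulse train $\sum_k u'_k \delta(t - kT)$,
which is filtered by p(t) to give $\sum_k u'_k p(t - kT)$ and multiplied by $\cos 2\pi f_c t$. Lower branch:
the sequence $\{u''_k\}$ forms $\sum_k u''_k \delta(t - kT)$, which is filtered by p(t) to give
$\sum_k u''_k p(t - kT)$ and multiplied by $-\sin 2\pi f_c t$. The two branches are summed to give x(t).]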


Figure 6.7: DSB-QC modulation 
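

As a sanity check on this structure, the sketch below builds the passband waveform in the two-branch form of (6.27) and compares it with twice the real part of $u(t)e^{2\pi i f_c t}$. It is only an illustration under assumed values: the pulse is taken to be sinc(t/T), and T, FC, the symbol list, and the helper names pam, dsb_qc, and via_complex are hypothetical.

```python
import cmath
import math

T = 1.0      # signal interval (assumed value)
FC = 4.0     # carrier frequency, chosen larger than the baseband bandwidth 1/(2T)

def sinc(t):
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def pulse(t):
    """Real modulation pulse p(t) = sinc(t/T)."""
    return sinc(t / T)

symbols = [1 + 1j, -3 + 1j, 3 - 3j, -1 - 1j, 1 + 3j]   # points from a 4x4 QAM set

def pam(coeffs, t):
    """Baseband PAM waveform sum_k a_k p(t - kT) for real coefficients a_k."""
    return sum(a * pulse(t - k * T) for k, a in enumerate(coeffs))

def dsb_qc(t):
    """Passband waveform built from two real PAM streams, as in (6.27)."""
    in_phase = pam([s.real for s in symbols], t)
    quadrature = pam([s.imag for s in symbols], t)
    return (2 * math.cos(2 * math.pi * FC * t) * in_phase
            - 2 * math.sin(2 * math.pi * FC * t) * quadrature)

def via_complex(t):
    """Same waveform computed as 2 Re{u(t) exp(2 pi i f_c t)}."""
    u_t = sum(s * pulse(t - k * T) for k, s in enumerate(symbols))
    return 2 * (u_t * cmath.exp(2j * math.pi * FC * t)).real

times = [0.037 * n for n in range(150)]
max_diff = max(abs(dsb_qc(t) - via_complex(t)) for t in times)
print(f"max difference = {max_diff:.2e}")
```

The two-branch form uses only real multiplications by the two carriers, which is the implementation advantage noted above.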


We have seen that u(t) can be recovered from x(t) by a Hilbert filter followed by shifting down
in frequency. A more easily implemented but equivalent procedure starts by multiplying x(t)
both by $\cos(2\pi f_c t)$ and by $-\sin(2\pi f_c t)$.
Using the trigonometric identities $2\cos^2(\alpha) = 1 + \cos(2\alpha)$, $2\sin(\alpha)\cos(\alpha) = \sin(2\alpha)$, and
$2\sin^2(\alpha) = 1 - \cos(2\alpha)$, these terms can be written as


$$x(t)\cos(2\pi f_c t) = \Re\{u(t)\} + \Re\{u(t)\}\cos(4\pi f_c t) - \Im\{u(t)\}\sin(4\pi f_c t), \qquad (6.28)$$


$$-x(t)\sin(2\pi f_c t) = \Im\{u(t)\} - \Re\{u(t)\}\sin(4\pi f_c t) - \Im\{u(t)\}\cos(4\pi f_c t). \qquad (6.29)$$


To interpret this, note that multiplying by $\cos(2\pi f_c t) = \frac{1}{2}e^{2\pi i f_c t} + \frac{1}{2}e^{-2\pi i f_c t}$ shifts x(t)
both up$^{16}$ and down in frequency by $f_c$. Thus the positive frequency part of x(t) gives rise to a baseband
term and a term around $2f_c$, and the negative frequency part gives rise to a baseband term and a
term at $-2f_c$. Filtering out the double frequency terms then yields $\Re\{u(t)\}$. The interpretation
of the sine multiplication is similar.


As another interpretation, recall that x(t) is real and consists of one band of frequencies around
$f_c$ and another around $-f_c$. Note also that (6.28) and (6.29) are the real and imaginary parts
of $x(t)e^{-2\pi i f_c t}$, which shifts the positive frequency part of x(t) down to baseband and shifts the
negative frequency part down to a band around $-2f_c$. In the Hilbert filter approach, the lower
band is filtered out before the frequency shift, and in the approach here, it is filtered out after
the frequency shift. Clearly the two are equivalent.
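

The mix-then-filter procedure can be sketched in the same periodic discrete-time setting used for illustration elsewhere (an assumption of this sketch, not something the text prescribes). Here x is multiplied by the cosine and by the negative sine, and an ideal lowpass filter, implemented by keeping only the DFT bins from -BW to BW, removes the terms around twice the carrier frequency. The names N, KC, BW, COEFFS, and lowpass are hypothetical.

```python
import cmath
import math

N = 64                      # samples per period (hypothetical discrete-time setup)
KC = 10                     # carrier frequency in DFT bins
BW = 3                      # baseband waveform occupies bins -BW..BW, with BW < KC
COEFFS = {-3: 0.5 - 1.0j, -1: 1.0 + 0.25j, 0: 0.3j, 2: -0.7 + 0.4j}

def lowpass(seq):
    """Keep only DFT bins -BW..BW of a real sequence (ideal lowpass filter)."""
    n = len(seq)
    out = [0.0] * n
    for k in range(-BW, BW + 1):
        coef = sum(seq[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
        for t in range(n):
            out[t] += (coef * cmath.exp(2j * math.pi * k * t / n)).real
    return out

u = [sum(c * cmath.exp(2j * math.pi * k * t / N) for k, c in COEFFS.items())
     for t in range(N)]
theta = [2 * math.pi * KC * t / N for t in range(N)]
x = [2 * ut.real * math.cos(th) - 2 * ut.imag * math.sin(th) for ut, th in zip(u, theta)]

# Mix with the two carriers, then filter out the terms around +-2*KC.
re_rx = lowpass([xt * math.cos(th) for xt, th in zip(x, theta)])
im_rx = lowpass([-xt * math.sin(th) for xt, th in zip(x, theta)])

max_err = max(abs(complex(a, b) - ut) for a, b, ut in zip(re_rx, im_rx, u))
print(f"max |u - u_rx| = {max_err:.2e}")
```

No Hilbert filter appears here; the unwanted band is removed after the frequency shift instead of before it.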


It has been assumed throughout that $f_c$ is greater than the baseband bandwidth of u(t). If this
is not true, then, as shown in Exercise 6.11, u(t) cannot be retrieved from x(t) by any approach.


$^{15}$The terminology comes from analog modulation where two real analog waveforms are modulated respectively
onto cosine and sine carriers. For analog modulation, it is customary to transmit an additional component of
carrier from which timing and phase can be recovered. As we see shortly, no such additional carrier is necessary
here.

$^{16}$This shift up in frequency is a little confusing, since $x(t)e^{-2\pi i f_c t} = x(t)\cos(2\pi f_c t) - i\,x(t)\sin(2\pi f_c t)$ is only a
shift down in frequency. What is happening is that $x(t)\cos(2\pi f_c t)$ is the real part of $x(t)e^{-2\pi i f_c t}$ and thus needs
positive frequency terms to balance the negative frequency terms.


Now assume that the baseband modulation filter p(t) is real and a standard QAM signal set is
used. Then $\Re\{u(t)\} = \sum_k u'_k p(t - kT)$ and $\Im\{u(t)\} = \sum_k u''_k p(t - kT)$ are parallel baseband PAM
modulations. Assume also that a receiver filter q(t) is chosen so that $\hat{g}(f) = \hat{p}(f)\hat{q}(f)$ satisfies
the Nyquist criterion and all the filters have the common bandwidth $B_b < f_c$. Then, from
(6.28), if $x(t)\cos(2\pi f_c t)$ is filtered by q(t), it can be seen that q(t) will filter out the component
around $2f_c$. The output from the remaining component, $\Re\{u(t)\}$, can then be sampled to retrieve
the real signal sequence $u'_1, u'_2, \ldots$. This plus the corresponding analysis of $-x(t)\sin(2\pi f_c t)$ is
illustrated in the DSB-QC receiver in Figure 6.8. Note that the use of the filter q(t) eliminates
the need for either filtering out the double frequency terms or using a Hilbert filter.


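[Figure 6.8 diagram. Upper branch: the received waveform is multiplied by $\cos 2\pi f_c t$, passed through
the receive filter q(t), and sampled at T-spaced intervals to give $\{u'_k\}$. Lower branch: the received
waveform is multiplied by $-\sin 2\pi f_c t$, passed through the receive filter q(t), and sampled at
T-spaced intervals to give $\{u''_k\}$.]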


Figure 6.8: DSB-QC demodulation 


The above description of demodulation ignores the noise. As explained in Section 6.3.2, however,
if p(t) is chosen so that $\{p(t - kT);\ k \in \mathbb{Z}\}$ is an orthonormal set (i.e., so that $|\hat{p}(f)|^2$ satisfies
the Nyquist criterion), then the receiver filter should satisfy $q(t) = p(-t)$. It will be shown later
that in the presence of white Gaussian noise, this is the optimal thing to do (in a sense to be
described later).


6.6 Signal space and degrees of freedom 


Using PAM, real signals can be generated at T-spaced intervals and transmitted in a baseband
bandwidth arbitrarily little more than $W_b = \frac{1}{2T}$. Thus, over an asymptotically long interval $T_0$,
and in a baseband bandwidth asymptotically close to $W_b$, $2W_bT_0$ real signals can be transmitted
using PAM.


Using QAM, complex signals can be generated at T-spaced intervals and transmitted in a passband
bandwidth arbitrarily little more than $W = \frac{1}{T}$. Thus, over an asymptotically long interval
$T_0$, and in a passband bandwidth asymptotically close to W, $WT_0$ complex signals, and thus
$2WT_0$ real signals, can be transmitted using QAM.


The above discussion described PAM at baseband and QAM at passband. To get a better
comparison of the two, consider an overall large baseband bandwidth $W_0$ broken into m passbands
each of bandwidth $W_0/m$. Using QAM in each band, we can asymptotically transmit $2W_0T_0$ real
signals in a long interval $T_0$. With PAM used over the entire band $W_0$, we again asymptotically
send $2W_0T_0$ real signals in a duration $T_0$. We see that in principle, QAM and baseband PAM
with the same overall bandwidth are equivalent in terms of the number of degrees of freedom
that can be used to transmit real signals. As pointed out earlier, however, PAM when modulated
up to passband uses only half the available degrees of freedom. Also, QAM offers considerably
more flexibility since it can be used over an arbitrary selection of frequency bands.


Recall that when we were looking at T-spaced truncated sinusoids and T-spaced sinc weighted
sinusoids, we argued that the class of real waveforms occupying a time interval $(-T_0/2, T_0/2)$
and a frequency interval $(-W_0, W_0)$ has about $2T_0W_0$ degrees of freedom for large $W_0$, $T_0$. What
we see now is that baseband PAM and passband QAM each employ about $2T_0W_0$ degrees of
freedom. In other words, these simple techniques essentially use all the degrees of freedom
available in the given bands.
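

The counting argument can be made concrete with a few lines of arithmetic. The sketch below ignores rolloff, so it gives the asymptotic counts only. The function names and the example values of the bandwidth and the interval are assumptions made for illustration.

```python
def pam_real_signals(w0, t0):
    """Real signals sent by baseband PAM in bandwidth w0 over time t0."""
    signal_interval = 1.0 / (2.0 * w0)      # W_b = 1/(2T)  =>  T = 1/(2 W_0)
    return t0 / signal_interval

def qam_real_signals(w0, t0, m):
    """Real signals sent by QAM in m passbands, each of bandwidth w0/m."""
    band = w0 / m
    signal_interval = 1.0 / band            # W = 1/T  =>  T = m / W_0
    complex_per_band = t0 / signal_interval
    return 2.0 * complex_per_band * m       # 2 real signals per complex signal

W0, T0 = 1.0e6, 2.0e-3   # assumed example values: 1 MHz and 2 ms
for m in (1, 4, 10):
    print(m, pam_real_signals(W0, T0), qam_real_signals(W0, T0, m))
```

For every m the two counts agree with $2W_0T_0$ up to rounding, which is the equivalence described above.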


The use of Nyquist theory here has added to our understanding of waveforms that are “essen- 
tially” time and frequency limited. That is, we can start with a family of functions that are 
bandlimited within a rolloff factor and then look at asymptotically small rolloffs. The discussion 
of noise in the next two chapters will provide a still better understanding of degrees of freedom 
subject to essential time and frequency limits. 


6.6.1 Distance and orthogonality 


Previous sections have shown how to modulate a complex QAM baseband waveform u(t) up to
a real passband waveform x(t) and how to retrieve u(t) from x(t) at the receiver. They have also
discussed signal constellations that minimize energy for given minimum distance. Finally, the
use of a modulation waveform p(t) with orthonormal shifts has connected the energy difference
between two baseband signal waveforms, say $u(t) = \sum_k u_k p(t - kT)$ and $v(t) = \sum_k v_k p(t - kT)$,
and the energy difference in the signal points by


$$\|u - v\|^2 = \sum_k |u_k - v_k|^2.$$


Now consider this energy difference at passband. The energy $\|x\|^2$ in the passband waveform
x(t) is twice that in the corresponding baseband waveform u(t). Next suppose that x(t) and
y(t) are the passband waveforms arising from the baseband waveforms u(t) and v(t) respectively.
Then


$$x(t) - y(t) = 2\Re\{u(t)e^{2\pi i f_c t}\} - 2\Re\{v(t)e^{2\pi i f_c t}\} = 2\Re\{[u(t) - v(t)]e^{2\pi i f_c t}\}.$$

Thus $x(t) - y(t)$ is the passband waveform corresponding to $u(t) - v(t)$, so


$$\|x(t) - y(t)\|^2 = 2\|u(t) - v(t)\|^2.$$


This says that for QAM and PAM, distances between waveforms are preserved (aside from the
scale factor of 2 in energy or $\sqrt{2}$ in distance) in going from baseband to passband. Thus distances
are preserved in going from signals to baseband waveforms to passband waveforms and back.
We will see later that the error probability caused by noise is essentially determined by the 
distances between the set of passband source waveforms. This error probability is then simply 
related to the choice of signal constellation and the discrete coding that precedes the mapping 
of data into signals. 
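

The factor of 2 can be checked numerically. The sketch below again assumes a periodic discrete-time setting, with energy taken as the sum of squared magnitudes over one period and with the carrier bin KC above the baseband bins. The names baseband, passband, and energy, and the coefficient values, are hypothetical.

```python
import cmath
import math

N = 64     # samples per period (hypothetical discrete-time setup)
KC = 10    # carrier frequency in DFT bins, larger than the baseband bandwidth

def baseband(coeffs):
    """Complex baseband waveform with the given {bin: coefficient} content."""
    return [sum(c * cmath.exp(2j * math.pi * k * t / N) for k, c in coeffs.items())
            for t in range(N)]

def passband(w):
    """x[t] = 2 Re{ w[t] exp(2 pi i KC t / N) }."""
    return [2 * (wt * cmath.exp(2j * math.pi * KC * t / N)).real
            for t, wt in enumerate(w)]

def energy(w):
    return sum(abs(wt) ** 2 for wt in w)

u = baseband({-2: 1 + 1j, 0: -0.5j, 3: 0.25})
v = baseband({-2: 1 - 1j, 1: 0.8, 3: -0.25j})
x, y = passband(u), passband(v)

d_base = energy([a - b for a, b in zip(u, v)])
d_pass = energy([a - b for a, b in zip(x, y)])
print(f"baseband: {d_base:.6f}  passband: {d_pass:.6f}  ratio: {d_pass / d_base:.6f}")
```

The ratio comes out as 2 up to rounding, independent of the particular coefficients, as long as the baseband content stays below the carrier.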


This preservation of distance through the modulation to passband and back is a crucial aspect
of the signal space viewpoint of digital communication. It provides a practical focus to viewing
waveforms at baseband and passband as elements of related $\mathcal{L}_2$ inner product spaces.


There is unfortunately a mathematical problem in this very nice story. The set of baseband
waveforms forms a complex inner product space whereas the set of passband waveforms
constitutes a real inner product space. The transformation $x(t) = 2\Re\{u(t)e^{2\pi i f_c t}\}$ is not linear, since,
for example, $iu(t)$ does not map into $ix(t)$ for $u(t) \neq 0$. In fact, the notion of a linear
transformation does not make much sense, since the transformation goes from complex $\mathcal{L}_2$ to real $\mathcal{L}_2$
and the scalars are different in the two spaces.


Example 6.6.1. As an important example, suppose the QAM modulation pulse is a real waveform
p(t) with orthonormal T-spaced shifts. The set of complex baseband waveforms spanned by
the orthonormal set $\{p(t - kT);\ k \in \mathbb{Z}\}$ has the form $\sum_k u_k p(t - kT)$ where each $u_k$ is complex.
As in (6.27), this is transformed at passband to


$$\sum_k u_k p(t - kT) \;\rightarrow\; 2\sum_k \Re\{u_k\}\, p(t - kT)\cos(2\pi f_c t) - 2\sum_k \Im\{u_k\}\, p(t - kT)\sin(2\pi f_c t).$$

Each baseband function $p(t - kT)$ is modulated to the passband waveform $2p(t - kT)\cos(2\pi f_c t)$.
The set of functions $\{p(t - kT)\cos(2\pi f_c t);\ k \in \mathbb{Z}\}$ is not enough to span the space of modulated
waveforms, however. It is necessary to add the additional set $\{p(t - kT)\sin(2\pi f_c t);\ k \in \mathbb{Z}\}$. As
shown in Exercise 6.15, this combined set of waveforms is an orthogonal set; with the factor of 2
included, each waveform has energy 2.


Another way to look at this example is to observe that modulating the baseband function
u(t) into the positive passband function $x^+(t) = u(t)e^{2\pi i f_c t}$ is somewhat easier to
understand in that the orthonormal set $\{p(t - kT);\ k \in \mathbb{Z}\}$ is modulated to the orthonormal set
$\{p(t - kT)e^{2\pi i f_c t};\ k \in \mathbb{Z}\}$, which can be seen to span the space of complex positive frequency
passband source waveforms. The additional set of orthonormal waveforms $\{p(t - kT)e^{-2\pi i f_c t};\ k \in \mathbb{Z}\}$
is then needed to span the real passband source waveforms. We then see that the sine, cosine
series is simply another way to express this. In the sine, cosine formulation all the coefficients in
the series are real, whereas in the complex exponential formulation, the coefficients are complex,
but they are pairwise dependent: the coefficient of $p(t - kT)e^{-2\pi i f_c t}$ is the complex conjugate of
that of $p(t - kT)e^{2\pi i f_c t}$. It will be easier to understand the effects of noise in the sine, cosine
formulation.


In the above example, we have seen that each orthonormal function at baseband gives rise to 
two real orthonormal functions at passband. It can be seen from a degrees of freedom argument 
that this is inevitable no matter what set of orthonormal functions are used at baseband. For a 
nominal passband bandwidth W, there are 2W real degrees of freedom per second in the baseband 
complex source waveform, which means there are 2 real degrees of freedom for each orthonormal
baseband waveform. At passband, we have the same 2W degrees of freedom per second, but 
with a real orthonormal expansion, there is only one real degree of freedom for each orthonormal 
waveform. Thus there must be two passband real orthonormal waveforms for each baseband 
complex orthonormal waveform. 


The sine, cosine expansion above generalizes in a nice way to an arbitrary set of complex or- 
thonormal baseband functions. Each complex function in this baseband set generates two real 
functions in an orthogonal passband set. This is expressed precisely in the following theorem 
which is proven in Exercise 6.16. 


Theorem 6.6.1. Let $\{\theta_k(t) : k \in \mathbb{Z}\}$ be an orthonormal set limited to the frequency band
$[-B/2, B/2]$. Let $f_c$ be greater than B/2 and for each $k \in \mathbb{Z}$ let


$$\psi_{k,1}(t) = \Re\{2\theta_k(t)\, e^{2\pi i f_c t}\},$$

$$\psi_{k,2}(t) = \Im\{-2\theta_k(t)\, e^{2\pi i f_c t}\}.$$


The set $\{\psi_{k,j};\ k \in \mathbb{Z},\ j \in \{1,2\}\}$ is an orthogonal set of functions, each of energy 2. Furthermore,
if $u(t) = \sum_k u_k\theta_k(t)$, then the corresponding passband function $x(t) = 2\Re\{u(t)e^{2\pi i f_c t}\}$ is given
by


$$x(t) = \sum_k \Re\{u_k\}\,\psi_{k,1}(t) + \Im\{u_k\}\,\psi_{k,2}(t).$$


This gives us a very general way to map any orthonormal set at baseband into a related
orthogonal set at passband, with two real orthogonal functions at passband corresponding to
each orthonormal function at baseband. It is not limited to any particular type of modulation,
and thus will allow us to make general statements about signal space at baseband and passband. 
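

The two claims of the theorem (orthogonality with energy 2, and the expansion of x(t)) can be checked on a small example. The sketch below is an illustration only: it assumes a periodic discrete-time setting in which the orthonormal baseband functions are unit-energy complex exponentials in a few DFT bins. The names theta, shift_up, inner, psi1, psi2, and u_coef are hypothetical.

```python
import cmath
import math

N = 64                      # samples per period (hypothetical discrete-time setup)
KC = 10                     # carrier bin, larger than the highest baseband bin
BINS = [-2, -1, 0, 1, 2]    # baseband bins, playing the role of the band [-B/2, B/2]

def theta(k):
    """Orthonormal baseband function: unit-energy complex exponential in bin k."""
    return [cmath.exp(2j * math.pi * k * t / N) / math.sqrt(N) for t in range(N)]

def shift_up(w):
    """2 w[t] exp(2 pi i KC t / N), before taking real or imaginary parts."""
    return [2 * wt * cmath.exp(2j * math.pi * KC * t / N) for t, wt in enumerate(w)]

def inner(a, b):
    return sum(p * q for p, q in zip(a, b))

th = {k: theta(k) for k in BINS}
psi1 = {k: [z.real for z in shift_up(th[k])] for k in BINS}      # Re{2 theta_k e^{...}}
psi2 = {k: [(-z).imag for z in shift_up(th[k])] for k in BINS}   # Im{-2 theta_k e^{...}}

# Gram matrix of the passband family should be 2 times the identity.
family = [psi[k] for k in BINS for psi in (psi1, psi2)]
gram_err = max(abs(inner(a, b) - (2.0 if i == j else 0.0))
               for i, a in enumerate(family) for j, b in enumerate(family))

# Expansion of x(t) in the passband family versus the direct formula.
u_coef = {-2: 1 - 2j, -1: 0.5j, 0: -1.5 + 0j, 1: 2 + 1j, 2: -0.3 + 0.7j}
u = [sum(u_coef[k] * th[k][t] for k in BINS) for t in range(N)]
x_direct = [z.real for z in shift_up(u)]
x_expand = [sum(u_coef[k].real * psi1[k][t] + u_coef[k].imag * psi2[k][t] for k in BINS)
            for t in range(N)]
expand_err = max(abs(a - b) for a, b in zip(x_direct, x_expand))
print(f"Gram matrix error: {gram_err:.2e}, expansion error: {expand_err:.2e}")
```

Each complex baseband function contributes two real passband functions here, which matches the degrees of freedom count given earlier.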


6.7 Carrier and phase recovery in QAM systems 


Consider a QAM receiver and visualize the passband-to-baseband conversion as multiplying the
positive frequency passband by the complex sinusoid $e^{-2\pi i f_c t}$. If the receiver has a phase error
$\phi(t)$ in its estimate of the phase of the transmitted carrier, then it will instead multiply the
incoming waveform by $e^{-2\pi i f_c t + i\phi(t)}$. We assume in this analysis that the time reference at the
receiver is perfectly known, so that the sampling of the filtered output is done at the correct 
time. Thus the assumption is that the oscillator at the receiver is not quite in phase with the 
oscillator at the transmitter. Note that the carrier frequency is usually orders of magnitude 
higher than the baseband bandwidth, and thus a small error in timing is significant in terms 
of carrier phase but not in terms of sampling. The carrier phase error will rotate the correct 
complex baseband signal u(t) by $\phi(t)$; i.e., the actual received baseband signal r(t) will be


$$r(t) = e^{i\phi(t)}u(t).$$


If $\phi(t)$ is slowly time-varying relative to the response q(t) of the receiver filter, then the samples
$\{r(kT)\}$ of the filter output will be


$$r(kT) \approx e^{i\phi(kT)}u_k,$$


as illustrated in Figure 6.9. The phase error $\phi(t)$ is said to come through coherently. This phase
coherence makes carrier recovery easy in QAM systems.


Figure 6.9: Rotation of constellation points by phase error 


As can be seen from the figure, if the phase error is small enough, and the set of points in the 
constellation are well enough separated, then the phase error can be simply corrected by moving 
to the closest signal point and adjusting the phase of the demodulating carrier accordingly. 


There are two complicating factors here. The first is that we have not taken noise into account
yet. When the received signal y(t) is x(t) + n(t), then the output of the T spaced sampler is not
the original signals $\{u_k\}$, but rather a noise corrupted version of them. The second problem is
that if a large phase error ever occurs, it cannot be corrected. For example, in Figure 6.9, if
$\phi(t) = \pi/2$, then even in the absence of noise, the received samples always line up with signals
from the constellation (but of course not the transmitted signals).
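

Both effects can be seen in a small noise-free sketch. It assumes a standard 4x4 QAM constellation with unit half-spacing, a small phase error of 8 degrees for the correctable case, and a quarter turn for the uncorrectable case. The names nearest and apparent_phase and the list of sent signals are hypothetical.

```python
import cmath
import math

LEVELS = (-3, -1, 1, 3)
CONSTELLATION = [complex(a, b) for a in LEVELS for b in LEVELS]   # standard 4x4 QAM

def nearest(r):
    """Closest constellation point to the received sample r."""
    return min(CONSTELLATION, key=lambda s: abs(r - s))

def apparent_phase(r, d):
    """Phase angle from decision d to received sample r."""
    return cmath.phase(r * d.conjugate())

sent = [3 + 3j, -1 + 3j, 1 - 1j, -3 - 1j, 3 - 1j]

# Small phase error: decisions are still correct and the rotation is visible.
phi = math.radians(8.0)        # assumed small carrier phase error
received = [cmath.exp(1j * phi) * s for s in sent]
decisions = [nearest(r) for r in received]
estimates = [apparent_phase(r, d) for r, d in zip(received, decisions)]
print("small error, decisions correct:", decisions == sent)

# Quarter-turn phase error: samples land on other constellation points,
# so the decisions are wrong and the apparent phase error is zero.
quarter = [cmath.exp(1j * math.pi / 2) * s for s in sent]
quarter_decisions = [nearest(r) for r in quarter]
quarter_estimates = [apparent_phase(r, d) for r, d in zip(quarter, quarter_decisions)]
print("quarter turn, decisions correct:", quarter_decisions == sent)
```

In the first case every estimate equals the true phase error, so it can be fed back to the carrier. In the second case the receiver sees nothing to correct.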


6.7.1 Tracking phase in the presence of noise 


The problem of deciding on or detecting the signals $\{u_k\}$ from the received samples $\{r(kT)\}$
in the presence of noise is a major topic of Chapter 8. Here, however, we have the added
complication of both detecting the transmitted signals and also tracking and eliminating the
phase error.


Fortunately, the problem of decision making and that of phase tracking are largely separable. 
The oscillators used to generate the modulating and demodulating carriers are relatively stable 
and have phases which change quite slowly relative to each other. Thus the phase error with 
any kind of reasonable tracking will be quite small, and thus the data signals can be detected 
from the received samples almost as if the phase error were zero. The difference between the 
received sample and the detected data signal will still be nonzero, mostly due to noise but partly 
due to phase error. However, the noise has zero mean (as we will see later) and thus tends
to average out over many sample times. Thus the general approach is to make decisions on the 
data signals as if the phase error is zero, and then to make slow changes to the phase based on 
averaging over many sample times. This approach is called decision directed carrier recovery. 
Note that if we track the phase as phase errors occur, we are also tracking the carrier, in both 
frequency and phase. 


In a decision directed scheme, assume that the received sample r(kT) is used to make a decision
$d_k$ on the transmitted signal point $u_k$. Also assume that $d_k = u_k$ with very high probability.
The apparent phase error for the kth sample is then the difference between the phase of r(kT)
and the phase of $d_k$. Any method for feeding back the apparent phase error to the generator of
the sinusoid $e^{-2\pi i f_c t + i\phi(t)}$ in such a way as to slowly reduce the apparent phase error will tend
to produce a robust carrier recovery system.


In one popular method, the feedback signal is taken as the imaginary part of $r(kT)d_k^*$. If the
phase angle from $d_k$ to r(kT) is $\phi_k$, then


$$r(kT)\,d_k^* = |r(kT)|\,|d_k|\,e^{i\phi_k},$$


so the imaginary part is $|r(kT)||d_k|\sin\phi_k \approx |r(kT)||d_k|\phi_k$ when $\phi_k$ is small. Decision-directed
carrier recovery based on such a feedback signal can be extremely robust even in the presence
of substantial distortion and large initial phase errors. With a second-order phase-locked carrier
recovery loop, it turns out that the carrier frequency $f_c$ can be recovered as well.
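

A first-order loop using this feedback signal might be sketched as follows. This is a toy illustration under stated assumptions, not a description of a practical recovery loop: the phase error is constant, the noise is Gaussian with an assumed small standard deviation, the feedback is normalized by $|d_k|^2$ (a choice made here for convenience), and the step size of 0.1 is arbitrary. The names track_phase, nearest, true_phase, and noise_std are hypothetical.

```python
import cmath
import math
import random

LEVELS = (-3, -1, 1, 3)
CONSTELLATION = [complex(a, b) for a in LEVELS for b in LEVELS]   # standard 4x4 QAM

def nearest(z):
    """Closest constellation point to z."""
    return min(CONSTELLATION, key=lambda s: abs(z - s))

def track_phase(samples, step=0.1):
    """First-order decision-directed loop; returns the estimate after each sample."""
    estimate = 0.0
    history = []
    for r in samples:
        z = r * cmath.exp(-1j * estimate)          # derotate with the current estimate
        d = nearest(z)                             # decide as if the phase error were zero
        feedback = (z * d.conjugate()).imag        # |z||d| sin(phi_k)
        estimate += step * feedback / abs(d) ** 2  # normalization is an assumption here
        history.append(estimate)
    return history

rng = random.Random(1)
true_phase = 0.15                                  # assumed constant phase error (radians)
noise_std = 0.03                                   # assumed noise level per real component
samples = []
for _ in range(400):
    u_k = rng.choice(CONSTELLATION)
    noise = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
    samples.append(cmath.exp(1j * true_phase) * u_k + noise)

history = track_phase(samples)
print(f"estimate after 10 samples:  {history[9]:.3f}")
print(f"estimate after 400 samples: {history[-1]:.3f}  (true phase {true_phase})")
```

The small step size makes the estimate change slowly, so the zero-mean noise averages out over many samples while the persistent phase offset is removed.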


6.7.2 Large phase errors 


A problem with decision-directed carrier recovery and with many other approaches is that the
recovered phase may settle into any value for which the received eye pattern (i.e., the pattern of
a long string of received samples as viewed on a scope) “looks OK.” With $(M' \times M')$-QAM signal
sets, as in Figure 6.9, the signal set has four-fold symmetry, and phase errors of 90°, 180°, or 270°
are not detectable. Simple differential coding methods that transmit the “phase” (quadrantal)
part of the signal information as a change of phase from the previous signal rather than as an
absolute phase can easily overcome this problem. Another approach is to resynchronize the
system frequently by sending some known pattern of signals. This latter approach is frequently
used in wireless systems where fading sometimes causes a loss of phase synchronization.
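

One way such a differential scheme could work is sketched below for the simplest case, a 4-point constellation in which each signal carries only a quadrant. The encoding rule, the function names encode, quadrant, and decode, and the data values are assumptions made for illustration; the text does not specify a particular scheme.

```python
import cmath
import math

def encode(quadrant_data, start=0):
    """Map each 2-bit value (0..3) to a change of quadrant from the previous signal."""
    state, signals = start, []
    for q in quadrant_data:
        state = (state + q) % 4
        signals.append(cmath.exp(1j * math.pi * (0.25 + 0.5 * state)))  # 4-QAM point
    return signals

def quadrant(z):
    """Quadrant index 0..3 of a sample, counterclockwise from the first quadrant."""
    return int(math.floor(cmath.phase(z) / (math.pi / 2))) % 4

def decode(received, start=0):
    """Recover the data from differences of successive quadrant indices."""
    previous, data = start, []
    for z in received:
        current = quadrant(z)
        data.append((current - previous) % 4)
        previous = current
    return data

data = [2, 0, 3, 1, 1, 2, 0, 3]
sent = encode(data)
for turns in range(4):
    rotated = [cmath.exp(1j * math.pi / 2 * turns) * z for z in sent]
    recovered = decode(rotated)
    print(turns, recovered[1:] == data[1:])
```

A rotation by any multiple of 90° shifts every quadrant index by the same amount, so the differences are unchanged. Only the first value, which has no transmitted predecessor, depends on the unknown rotation.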


6.8 Summary of modulation and demodulation 


This chapter has used the signal space developed in Chapters 4 and 5 to study the mapping of 
binary input sequences at a modulator into the waveforms to be transmitted over the channel. 
Figure 6.1 summarized this process, mapping bits to signals, then signals to baseband waveforms, 
and then baseband waveforms to passband waveforms. The demodulator goes through the 
inverse process, going from passband waveforms to baseband waveforms to signals to bits. This 
breaks the modulation process into three layers that can be studied more or less independently. 


The development used PAM and QAM throughout, both as widely used systems, and as conve- 
nient ways to bring out the principles that can be applied more widely. 


The mapping from binary digits to signals segments the incoming binary sequence into b-tuples
of bits and then maps the set of $M = 2^b$ b-tuples into a constellation of M signal points in $\mathbb{R}^m$
or $\mathbb{C}^m$ for some convenient m. Since the m components of these signal points are going to be
used as coefficients in an orthogonal expansion to generate the waveforms, the objectives are to
choose a signal constellation with small average energy but with a large distance between each
pair of points. PAM is an example where the signal space is $\mathbb{R}^1$ and QAM is an example where
the signal space is $\mathbb{C}^1$. For both of these, the standard mapping is the same as the representation
points of a uniform quantizer. These are not quite optimal in terms of minimizing the average 
energy for a given minimum point spacing, but they are almost universally used because of the 
near-optimality and the simplicity. 


The mapping of signals into baseband waveforms for PAM chooses a fixed waveform p(t) and
modulates the sequence of signals $u_1, u_2, \ldots$ into the baseband waveform $\sum_j u_j p(t - jT)$. One
of the objectives in choosing p(t) is to be able to retrieve the sequence $u_1, u_2, \ldots$ from the
received waveform. This involves an output filter q(t) which is sampled each T seconds to
retrieve $u_1, u_2, \ldots$. The Nyquist criterion was derived, specifying the properties that the product
$\hat{g}(f) = \hat{p}(f)\hat{q}(f)$ must satisfy to avoid intersymbol interference. The objective in choosing $\hat{g}(f)$
is a trade off between the closeness of $\hat{g}(f)$ to $T\,\mathrm{rect}(fT)$ and the time duration of g(t), subject
to satisfying the Nyquist criterion. The raised cosine functions are widely used as a good
compromise between these dual objectives. For a given real $\hat{g}(f)$, the choice of $\hat{p}(f)$ usually
satisfies $\hat{g}(f) = |\hat{p}(f)|^2$, and in this case $\{p(t - kT);\ k \in \mathbb{Z}\}$ is a set of orthonormal functions.
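

The time-domain side of the Nyquist criterion, namely that g(kT) is 1 for k = 0 and 0 for every other integer k, is easy to check for the raised cosine pulses. The sketch below uses the commonly quoted closed form for the raised cosine pulse with rolloff alpha; this form is assumed here to agree with the one given earlier in the chapter. The function names are hypothetical.

```python
import math

def sinc(t):
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def raised_cosine(t, alpha, T=1.0):
    """Raised cosine pulse with rolloff alpha (standard form, assumed to match the text)."""
    s = t / T
    denom = 1.0 - (2.0 * alpha * s) ** 2
    if abs(denom) < 1e-12:                       # removable singularity at |t| = T/(2 alpha)
        return (math.pi / 4.0) * sinc(1.0 / (2.0 * alpha))
    return sinc(s) * math.cos(math.pi * alpha * s) / denom

def isi_samples(alpha, span=6):
    """Samples g(kT) for k = -span..span; ideal Nyquist means 1 at k = 0 and 0 elsewhere."""
    return [raised_cosine(float(k), alpha) for k in range(-span, span + 1)]

for alpha in (0.0, 0.25, 0.35, 1.0):
    samples = isi_samples(alpha)
    worst = max(abs(v) for i, v in enumerate(samples) if i != 6)
    print(f"alpha = {alpha}: g(0) = {samples[6]:.3f}, largest |g(kT)|, k != 0: {worst:.1e}")
```

Larger rolloff makes the pulse decay faster in time at the cost of more bandwidth, which is the trade off described above, while the T-spaced samples stay free of intersymbol interference for every alpha.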


Most of the remainder of the chapter discussed modulation from baseband to passband. This 
was primarily a topic in the manipulation of Fourier transforms, and need not be summarized 
here.


6.E Exercises


6.1. (PAM) Consider standard M-PAM and assume that the signals are used with equal
probability. Show that the average energy per signal $E_s = \overline{U_k^2}$ is equal to the average energy
$\overline{U^2} = d^2M^2/12$ of a uniform continuous distribution over the interval $[-dM/2, dM/2]$,
minus the average energy $\overline{(U - U_k)^2} = d^2/12$ of a uniform continuous distribution over the
interval $[-d/2, d/2]$:


$$E_s = \frac{d^2(M^2 - 1)}{12}.$$


This establishes (6.4). Verify the formula for M = 4 and M = 8.


6.2. (PAM) A discrete memoryless source emits binary equiprobable symbols at a rate of 1000
symbols per second. The symbols from a one second interval are grouped into pairs and
sent over a bandlimited channel using a standard 4-PAM signal set. The modulation uses
a signal interval 0.002 and pulse p(t) = sinc(t/T).

(a) Suppose that a sample sequence $u_1, \ldots, u_{500}$ of transmitted signals includes 115
appearances of 3d/2, 130 appearances of d/2, 120 appearances of −d/2, and 135
appearances of −3d/2. Find the energy in the corresponding transmitted waveform
$u(t) = \sum_{k=1}^{500} u_k\,\mathrm{sinc}(\frac{t}{T} - k)$ as a function of d.

(b) What is the bandwidth of the waveform u(t) in part (a)?

(c) Find $E\left[\int U^2(t)\,dt\right]$ where U(t) is the random waveform $\sum_{k=1}^{500} U_k\,\mathrm{sinc}(\frac{t}{T} - k)$.

(d) Now suppose that the binary source is not memoryless, but is instead generated by a
Markov chain where


$$\Pr(X_i = 1 \mid X_{i-1} = 1) = \Pr(X_i = 0 \mid X_{i-1} = 0) = 0.9.$$


Assume the Markov chain starts in steady state with $\Pr(X_1 = 1) = 1/2$. Using the mapping
$(00 \to a_1)$, $(01 \to a_2)$, $(10 \to a_3)$, $(11 \to a_4)$, find $E[U_k^2]$ for $1 \le k \le 500$.

(e) Find $E\left[\int U^2(t)\,dt\right]$ for this new source.

(f) For the above Markov chain, explain how we could change the above mapping to reduce
the expected energy without changing the separation between signal points.


6.3. (a) Assume that the received signal in a 4-PAM system is $V_k = U_k + Z_k$ where $U_k$ is the
transmitted 4-PAM signal at time k. Let $Z_k$ be independent of $U_k$ and Gaussian with
density $f_Z(z) = \sqrt{\frac{1}{2\pi}}\exp\{-z^2/2\}$. Assume that the receiver chooses the signal $\tilde{U}_k$ closest
to $V_k$. (It is shown in Chapter 8 that this detection rule minimizes $P_e$ for equiprobable
signals.) Find the probability $P_e$ (in terms of Gaussian integrals) that $\tilde{U}_k \neq U_k$.

(b) Evaluate the partial derivative of $P_e$ with respect to the third signal point $a_3$ (i.e., the
positive inner signal point) at the point where $a_3$ is equal to its value d/2 in standard
4-PAM and all other signal points are kept at their 4-PAM values. Hint: This doesn’t
require any calculation.

(c) Evaluate the partial derivative of the signal energy $E_s$ with respect to $a_3$.

(d) Argue from this that the minimum error probability signal constellation for 4 equiprobable
signal points is not 4-PAM, but rather a constellation where the distance between the
inner points is smaller than the distance from inner point to outer point on either side.
(This is quite surprising intuitively to the author.)

6.4. (Nyquist) Suppose that the PAM modulated baseband waveform $u(t) = \sum_{k=-\infty}^{\infty} u_k p(t - kT)$
is received. That is, u(t) is known, T is known, and p(t) is known. We want to determine
the signals $\{u_k\}$ from u(t). We assume we must use only linear operations. That is, we
wish to find some waveform $d_k(t)$ for each integer k such that $\int_{-\infty}^{\infty} u(t)d_k(t)\,dt = u_k$.

(a) What properties must be satisfied by $d_k(t)$ such that the above equation is satisfied no
matter what values are taken by the other signals, $\ldots, u_{k-2}, u_{k-1}, u_{k+1}, u_{k+2}, \ldots$? These
properties should take the form of constraints on the inner products $\langle p(t - kT), d_j(t)\rangle$. Do
not worry about convergence, interchange of limits, etc.

(b) Suppose you find a function $d_0(t)$ that satisfies these constraints for k = 0. Show that
for each k, a function $d_k(t)$ satisfying these constraints can be found simply in terms of
$d_0(t)$.

(c) What is the relationship between $d_0(t)$ and a function q(t) that avoids intersymbol
interference in the approach taken in Section 6.3 (i.e., a function q(t) such that p(t) * q(t)
is ideal Nyquist)?

You have shown that the filter/sample approach in Section 6.3 is no less general than the
arbitrary linear operation approach here. Note that, in the absence of noise and with a
known signal constellation, it might be possible to retrieve the signals from the waveform
using nonlinear operations even in the presence of intersymbol interference.

6.5. (Nyquist) Let v(t) be a continuous $\mathcal{L}_2$ waveform with v(0) = 1 and define
$g(t) = v(t)\,\mathrm{sinc}(\frac{t}{T})$.

(a) Show that g(t) is ideal Nyquist with interval T.

(b) Find $\hat{g}(f)$ as a function of $\hat{v}(f)$.

(c) Give a direct demonstration that $\hat{g}(f)$ satisfies the Nyquist criterion.

(d) If v(t) is baseband limited to $B_b$, what is g(t) baseband limited to?

Note: The usual form of the Nyquist criterion helps in choosing waveforms that avoid
intersymbol interference with prescribed rolloff properties in frequency. The approach
above shows how to avoid intersymbol interference with prescribed attenuation in time
and in frequency.

6.6. (Nyquist) Consider a PAM baseband system in which the modulator is defined by a signal
interval T and a waveform p(t), the channel is defined by a filter h(t), and the receiver is
defined by a filter q(t) which is sampled at T-spaced intervals. The received waveform, after
the receive filter q(t), is then given by $r(t) = \sum_k u_k g(t - kT)$ where g(t) = p(t) * h(t) * q(t).

(a) What property must g(t) have so that $r(kT) = u_k$ for all k and for all choices of input
$\{u_k\}$? What is the Nyquist criterion for $\hat{g}(f)$?

(b) Now assume that T = 1/2 and that p(t), h(t), q(t) and all their Fourier transforms are
restricted to be real. Assume further that $\hat{p}(f)$ and $\hat{h}(f)$ are given by

$$\hat{p}(f) = \begin{cases} 1, & |f| \le 0.5; \\ 1.5 - |f|, & 0.5 < |f| \le 1.5; \\ 0, & |f| > 1.5 \end{cases} \qquad \hat{h}(f) = \begin{cases} 1, & |f| \le 0.75; \\ 0, & 0.75 < |f| \le 1; \\ 1, & 1 < |f| \le 1.25; \\ 0, & |f| > 1.25 \end{cases}$$

[Figure: sketches of $\hat{p}(f)$ and $\hat{h}(f)$.]

Is it possible to choose a receive filter transform $\hat{q}(f)$ so that there is no intersymbol
interference? If so, give such a $\hat{q}(f)$ and indicate the regions in which your solution is
nonunique.


(c) Redo part (b) with the modification that now $\hat{h}(f) = 1$ for $|f| \le 0.75$ and $\hat{h}(f) = 0$ for
$|f| > 0.75$.


(d) Explain the conditions on $\hat{p}(f)\hat{h}(f)$ under which intersymbol interference can be avoided
by proper choice of $\hat{q}(f)$ (you may assume, as above, that $\hat{p}(f)$, $\hat{h}(f)$, p(t), and h(t) are all
real).


6.7. (Nyquist) Recall that the rect(t/T) function has the very special property that it, plus its
time and frequency shifts by kT and j/T respectively, form an orthogonal set of functions.
The function sinc(t/T) has this same property. This problem is about some other functions
that are generalizations of rect(t/T) and which, as you will show in parts (a) to (d), have
this same interesting property. For simplicity, choose T to be 1.


These functions take only the values 0 and 1 and are allowed to be nonzero only over [−1,
1] rather than [−1/2, 1/2] as with rect(t). Explicitly, the functions considered here satisfy
the following constraints:


$$p^2(t) = p(t) \quad \text{for all } t \quad \text{(0/1 property)} \qquad (6.30)$$
$$p(t) = 0 \quad \text{for } |t| > 1 \qquad (6.31)$$
$$p(t) = p(-t) \quad \text{for all } t \quad \text{(symmetry)} \qquad (6.32)$$
$$p(t) = 1 - p(t - 1) \quad \text{for } 0 \le t < 1/2. \qquad (6.33)$$


Note: Because of property (6.32), condition (6.33) also holds for $1/2 < t \le 1$. Note also
that p(t) at the single points $t = \pm 1/2$ does not affect any orthogonality properties, so you
are free to ignore these points in your arguments.


[Figure: rect(t), and another choice of p(t) that satisfies (1) to (4).]


(a) Show that p(t) is orthogonal to $p(t - 1)$. Hint: evaluate $p(t)p(t - 1)$ for each $t \in [0, 1]$
other than t = 1/2.


(b) Show that p(t) is orthogonal to $p(t - k)$ for all integer $k \neq 0$.

(c) Show that p(t) is orthogonal to $p(t - k)e^{2\pi i m t}$ for integer $m \neq 0$ and $k \neq 0$.

(d) Show that p(t) is orthogonal to $p(t)e^{2\pi i m t}$ for integer $m \neq 0$. Hint: Evaluate
$p(t)e^{-2\pi i m t} + p(t - 1)e^{-2\pi i m (t - 1)}$.

(e) Let $h(t) = \hat{p}(t)$ where $\hat{p}(f)$ is the Fourier transform of p(t). If p(t) satisfies properties
(1) to (4), does it follow that h(t) has the property that it is orthogonal to $h(t - k)e^{2\pi i m t}$
whenever either the integer k or m is nonzero?


Note: Almost no calculation is required in this exercise.


6.8. (Nyquist) (a) For the special case $\alpha = 1$, $T = 1$, verify the formula in (6.18) for $\hat{g}_1(f)$ given 
$g_1(t)$ in (6.17). Hint: As an intermediate step, verify that $g_1(t) = \mathrm{sinc}(2t) + \frac{1}{2}\,\mathrm{sinc}(2t+1) + 
\frac{1}{2}\,\mathrm{sinc}(2t-1)$. Sketch $g_1(t)$, in particular showing its value at $mT/2$ for each $m \ge 0$. 
(b) For the general case $0 < \alpha < 1$, $T = 1$, show that $\hat{g}_\alpha(f)$ is the convolution of $\mathrm{rect}(f)$ 
with a half cycle of $\beta\cos(\pi f/\alpha)$ and specify the required value of $\beta$. 
(c) Verify (6.18) for $0 < \alpha < 1$, $T = 1$ and then verify for arbitrary $T > 0$. 


6.9. (Approximate Nyquist) This exercise shows that approximations to the Nyquist criterion 
must be treated with great care. Define $\hat{g}_k(f)$, for integer $k \ge 0$, as in the diagram below 
for $k = 2$. For arbitrary $k$, there are $k$ small pulses on each side of the main pulse, each of 
height $1/k$. 


[Figure: $\hat{g}_k(f)$ for $k = 2$.] 


(a) Show that $\hat{g}_k(f)$ satisfies the Nyquist criterion for $T = 1$ and for each $k \ge 1$. 

(b) Show that $\mathrm{l.i.m.}_{k\to\infty}\, \hat{g}_k(f)$ is simply the central pulse above. That is, this $\mathcal{L}_2$ limit 
satisfies the Nyquist criterion for $T = 1/2$. To put it another way, $\hat{g}_k(f)$, for large $k$, satisfies 
the Nyquist criterion for $T = 1$ using ‘approximately’ the bandwidth $1/4$ rather than the 
necessary bandwidth $1/2$. The problem is that the $\mathcal{L}_2$ notion of approximation (done carefully 
here as a limit in the mean of a sequence of approximations) is not always appropriate, and 
it is often inappropriate with sampling issues. 

6.10. (Nyquist) (a) Assume that $\hat{p}(f) = \hat{q}^*(f)$ and $\hat{g}(f) = \hat{p}(f)\hat{q}(f)$. Show that if $p(t)$ is real, 
then $\hat{g}(f) = \hat{g}(-f)$ for all $f$. 
(b) Under the same assumptions, find an example where $p(t)$ is not real but $\hat{g}(f) \ne \hat{g}(-f)$ 
and $\hat{g}(f)$ satisfies the Nyquist criterion. Hint: Show that $\hat{g}(f) = 1$ for $0 \le f \le 1$ and 
$\hat{g}(f) = 0$ elsewhere satisfies the Nyquist criterion for $T = 1$ and find the corresponding 
$p(t)$. 

6.11. (Passband) (a) Let $u_k(t) = \exp(2\pi i f_k t)$ for $k = 1, 2$ and let $x_k(t) = 2\Re\{u_k(t)\exp(2\pi i f_c t)\}$. 
Assume $f_1 > -f_c$ and find the $f_2 \ne f_1$ such that $x_1(t) = x_2(t)$. 

(b) Explain that what you have done is to show that, without the assumption that the 
bandwidth of $u(t)$ is less than $f_c$, it is impossible to always retrieve $u(t)$ from $x(t)$, even in 
the absence of noise. 

(c) Let $y(t)$ be a real $\mathcal{L}_2$ function. Show that the result in part (a) remains valid if 
$u_k(t) = y(t)\exp(2\pi i f_k t)$ (i.e., show that the result in part (a) is valid with a restriction to 
$\mathcal{L}_2$ functions). 

(d) Show that if $u(t)$ is restricted to be real, then $u(t)$ can be retrieved almost everywhere 
from $x(t) = 2\Re\{u(t)\exp(2\pi i f_c t)\}$. Hint: express $x(t)$ in terms of $\cos(2\pi f_c t)$. 

(e) Show that if the bandwidth of $u(t)$ exceeds $f_c$, then neither Figure 6.6 nor Figure 6.8 
works correctly, even when $u(t)$ is real. 

6.12. (QAM) (a) Let $\theta_1(t)$ and $\theta_2(t)$ be orthonormal complex waveforms. Let $\phi_j(t) = \theta_j(t)e^{2\pi i f_c t}$ 
for $j = 1, 2$. Show that $\phi_1(t)$ and $\phi_2(t)$ are orthonormal for any $f_c$. 

(b) Suppose that $\theta_2(t) = \theta_1(t-T)$. Show that $\phi_2(t) = \phi_1(t-T)$ if $f_c$ is an integer multiple 
of $1/T$. 

6.13. (QAM) (a) Assume $B/2 < f_c$. Let $u(t)$ be a real function and let $v(t)$ be an imaginary 
function, both baseband limited to $B/2$. Show that the corresponding passband functions, 
$\Re\{u(t)e^{2\pi i f_c t}\}$ and $\Re\{v(t)e^{2\pi i f_c t}\}$, are orthogonal. 

(b) Give an example where the functions in part (a) are not orthogonal if $B/2 > f_c$. 

6.14. (a) Derive (6.28) and (6.29) using trigonometric identities. 
(b) View the left side of (6.28) and (6.29) as the real and imaginary part respectively of 
$x(t)e^{-2\pi i f_c t}$. Rederive (6.28) and (6.29) using complex exponentials. (Note how much easier 
this is than part (a).) 


6.15. (Passband expansions) Assume that $\{p(t-kT) : k \in \mathbb{Z}\}$ is a set of orthonormal functions. 
Assume that $\hat{p}(f) = 0$ for $|f| > f_c$. 
(a) Show that $\{\sqrt{2}\,p(t-kT)\cos(2\pi f_c t);\ k \in \mathbb{Z}\}$ is an orthonormal set. 


(b) Show that $\{\sqrt{2}\,p(t-kT)\sin(2\pi f_c t);\ k \in \mathbb{Z}\}$ is an orthonormal set and that each function 
in it is orthogonal to the cosine set in part (a). 


6.16. (Passband expansions) Prove Theorem 6.6.1. Hint: First show that the set of functions 
$\{\hat{\psi}_{k,1}(f)\}$ and $\{\hat{\psi}_{k,2}(f)\}$ are orthogonal with energy 2 by comparing the integral over negative 
frequencies with that over positive frequencies. Indicate explicitly why you need 
$f_c > B/2$. 


6.17. (Phase and envelope modulation) This exercise shows that any real passband waveform 
can be viewed as a combination of phase and amplitude modulation. Let $x(t)$ be an $\mathcal{L}_2$ 
real passband waveform of bandwidth $B$ around a carrier frequency $f_c > B/2$. Let $x^+(t)$ 
be the positive frequency part of $x(t)$ and let $u(t) = x^+(t)\exp\{-2\pi i f_c t\}$. 
(a) Express $x(t)$ in terms of $\Re\{u(t)\}$, $\Im\{u(t)\}$, $\cos[2\pi f_c t]$, and $\sin[2\pi f_c t]$. 
(b) Define $\phi(t)$ implicitly by $e^{i\phi(t)} = u(t)/|u(t)|$. Show that $x(t)$ can be expressed as $x(t) = 
2|u(t)|\cos[2\pi f_c t + \phi(t)]$. Draw a sketch illustrating that $2|u(t)|$ is a baseband waveform 
upper-bounding $x(t)$ and touching $x(t)$ roughly once per cycle. Either by sketch or words, 
illustrate that $\phi(t)$ is a phase modulation on the carrier. 


(c) Define the envelope of a passband waveform $x(t)$ as twice the magnitude of its positive 
frequency part, i.e., as $2|x^+(t)|$. Without changing the waveform $x(t)$ (or $x^+(t)$) from that 
before, change the carrier frequency from $f_c$ to some other frequency $f_c'$. Thus $u'(t) = 
x^+(t)\exp\{-2\pi i f_c' t\}$. Show that $|x^+(t)| = |u(t)| = |u'(t)|$. Note that you have shown that 
the envelope does not depend on the assumed carrier frequency, but has the interpretation 
of part (b). 

(d) Show the relationship of the phase $\phi'(t)$ for the carrier $f_c'$ to that for the carrier $f_c$. 
(e) Let $p(t) = |x(t)|^2$ be the power in $x(t)$. Show that if $p(t)$ is lowpass filtered to bandwidth 
$B$, the result is $2|u(t)|^2$. Interpret this filtering as a short-term time average over $|x(t)|^2$ to 
explain why the envelope squared is twice the short-term average power (and thus why 
the envelope is $\sqrt{2}$ times the short-term root-mean-squared amplitude). 


6.18. (Carrierless amplitude-phase modulation (CAP)) We have seen how to modulate a baseband 
QAM waveform up to passband and then demodulate it by shifting down to baseband, 
followed by filtering and sampling. This exercise explores the interesting concept of eliminating 
the baseband operations by modulating and demodulating directly at passband. 
This approach is used in one of the North American standards for Asymmetrical Digital 
Subscriber Loop (ADSL). 
(a) Let $\{u_k\}$ be a complex data sequence and let $u(t) = \sum_k u_k\, p(t - kT)$ be the corresponding 
modulated output. Let $\hat{p}(f)$ be equal to $\sqrt{T}$ over $f \in [3/(2T), 5/(2T)]$ and be equal to 
0 elsewhere. At the receiver, $u(t)$ is filtered using $p(t)$ and the output $y(t)$ is then T-space 
sampled at time instants $kT$. Show that $y(kT) = u_k$ for all $k \in \mathbb{Z}$. Don’t worry about the 
fact that the transmitted waveform $u(t)$ is complex. 


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare 
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 




(b) Now suppose that $\hat{p}(f) = \sqrt{T}\,\mathrm{rect}[T(f - f_c)]$ for some arbitrary $f_c$ rather than $f_c = 2/T$ 
as in part (a). For what values of $f_c$ does the scheme still work? 

(c) Suppose that R{u(t)} is now sent over a communication channel. Suppose that the 
received waveform is filtered by a Hilbert filter before going through the demodulation 
procedure above. Does the scheme still work? 

Chapter 7 


Random processes and noise 


7.1 Introduction 


Chapter 6 discussed modulation and demodulation, but replaced any detailed discussion of the 
noise by the assumption that a minimal separation is required between each pair of signal points. 
This chapter develops the underlying principles needed to understand noise, and the next chapter 
shows how to use these principles in detecting signals in the presence of noise. 


Noise is usually the fundamental limitation for communication over physical channels. This 
can be seen intuitively by accepting for the moment that different possible transmitted wave- 
forms must have a difference of some minimum energy to overcome the noise. This difference 
reflects back to a required distance between signal points, which along with a transmitted power 
constraint, limits the number of bits per signal that can be transmitted. 


The transmission rate in bits per second is then limited by the product of the number of bits per 
signal times the number of signals per second, i.e., the number of degrees of freedom per second 
that signals can occupy. This intuitive view is substantially correct, but must be understood at 
a deeper level which will come from a probabilistic model of the noise. 


This chapter and the next will adopt the assumption that the channel output waveform has the 
form y(t) = x(t) + z(t) where x(t) is the channel input and z(t) is the noise. The channel input 
x(t) depends on the random choice of binary source digits, and thus x(t) has to be viewed as a 
particular selection out of an ensemble of possible channel inputs. Similarly, z(t) is a particular 
selection out of an ensemble of possible noise waveforms. 


The assumption that y(t) = x(t) + z(t) implies that the channel attenuation is known and 
removed by scaling the received signal and noise. It also implies that the input is not filtered or 
distorted by the channel. Finally it implies that the delay and carrier phase between input and 
output is known and removed at the receiver. 


The noise should be modeled probabilistically. This is partly because the noise is a priori 
unknown, but can be expected to behave in statistically predictable ways. It is also because 
encoders and decoders are designed to operate successfully on a variety of different channels, all 
of which are subject to different noise waveforms. The noise is usually modeled as zero mean, 
since a mean can be trivially removed. 


Modeling the waveforms x(t) and z(t) probabilistically will take considerable care. If x(t) and 
z(t) were defined only at discrete values of time, such as $\{t = kT;\ k \in \mathbb{Z}\}$, then they could 
be modeled as sample values of sequences of random variables (rv’s). These sequences of rv’s 
could then be denoted as $X(t) = \{X(kT);\ k \in \mathbb{Z}\}$ and $Z(t) = \{Z(kT);\ k \in \mathbb{Z}\}$. The case of 
interest here, however, is where x(t) and z(t) are defined over the continuum of values of t, and 
thus a continuum of rv’s is required. Such a probabilistic model is known as a random process 
or, synonymously, a stochastic process. These models behave somewhat similarly to random 
sequences, but they behave differently in a myriad of small but important ways. 


7.2 Random processes 


A random process $\{Z(t);\ t \in \mathbb{R}\}$ is a collection¹ of rv’s, one for each $t \in \mathbb{R}$. The parameter $t$ 
usually models time, and any given instant in time is often referred to as an epoch. Thus there 
is one rv for each epoch. Sometimes the range of $t$ is restricted to some finite interval, $[a, b]$, 
and then the process is denoted as $\{Z(t);\ t \in [a, b]\}$. 


There must be an underlying sample space $\Omega$ over which these rv’s are defined. That is, for 
each epoch $t \in \mathbb{R}$ (or $t \in [a, b]$), the rv $Z(t)$ is a function $\{Z(t,\omega);\ \omega \in \Omega\}$ mapping sample points 
$\omega \in \Omega$ to real numbers. 


A given sample point $\omega \in \Omega$ within the underlying sample space determines the sample values 
of $Z(t)$ for each epoch $t$. The collection of all these sample values for a given sample point $\omega$, 
i.e., $\{Z(t,\omega);\ t \in \mathbb{R}\}$, is called a sample function $\{z(t) : \mathbb{R} \to \mathbb{R}\}$ of the process. 


Thus $Z(t,\omega)$ can be viewed as a function of $\omega$ for fixed $t$, in which case it is the rv $Z(t)$, 
or it can be viewed as a function of $t$ for fixed $\omega$, in which case it is the sample function 
$\{z(t) : \mathbb{R} \to \mathbb{R}\} = \{Z(t,\omega);\ t \in \mathbb{R}\}$ corresponding to the given $\omega$. Viewed as a function of both 
$t$ and $\omega$, $\{Z(t,\omega);\ t \in \mathbb{R},\ \omega \in \Omega\}$ is the random process itself; the sample point $\omega$ is usually 
suppressed, denoting the process as $\{Z(t);\ t \in \mathbb{R}\}$. 
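

The two views of $Z(t,\omega)$ described above can be illustrated with a small numerical sketch. The 
sketch below is only an illustration under stated assumptions: the sample space is replaced by a 
hypothetical finite set of five sample points, the epochs by a finite grid, and the toy process 
$Z(t,\omega) = A(\omega)\cos(2\pi t) + B(\omega)\,t$ is invented for the purpose. Rows of the array then play the 
role of sample functions and columns play the role of rv’s. 

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite stand-in for the sample space: 5 sample points omega.
num_sample_points = 5
epochs = np.linspace(-1.0, 1.0, 9)   # a finite grid of epochs t (assumption)

# Toy process Z(t, omega) = A(omega) cos(2 pi t) + B(omega) t, where A and B
# are rv's, i.e., one value of each per sample point.
A = rng.standard_normal(num_sample_points)
B = rng.standard_normal(num_sample_points)
Z = A[:, None] * np.cos(2 * np.pi * epochs)[None, :] + B[:, None] * epochs[None, :]

# Fixing omega (one row) gives a sample function z(t) on the grid of epochs.
sample_function = Z[2, :]

# Fixing t (one column) gives the values taken by the rv Z(t), one per sample point.
rv_at_epoch = Z[:, 4]                # epochs[4] is t = 0, where Z(0, omega) = A(omega)
```

In this toy model the column at $t = 0$ simply reproduces the values of $A$, which makes the point 
that each epoch carries its own rv defined on the same sample space. 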


Suppose a random process $\{Z(t);\ t \in \mathbb{R}\}$ models the channel noise and $\{z(t) : \mathbb{R} \to \mathbb{R}\}$ is a sample 
function of this process. At first this seems inconsistent with the traditional elementary view 
that a random process or set of rv’s models an experimental situation a priori (before performing 
the experiment) and the sample function models the result a posteriori (after performing the 
experiment). The trouble here is that the experiment might run from $t = -\infty$ to $t = \infty$, so 
there can be no “before” for the experiment and “after” for the result. 


There are two ways out of this perceived inconsistency. First, the notion of ‘before and after’ 
in the elementary view is inessential; the only important thing is the view that a multiplicity of 
sample functions might occur, but only one actually occurs. This point of view is appropriate in 
designing a cellular telephone for manufacture. Each individual phone that is sold experiences 
its own noise waveform, but the device must be manufactured to work over the multiplicity of 
such waveforms. 


Second, whether we view a function of time as going from $-\infty$ to $+\infty$ or going from some 
large negative to large positive time is a matter of mathematical convenience. We often model 
waveforms as persisting from $-\infty$ to $+\infty$, but this simply indicates a situation in which the 
starting time and ending time are sufficiently distant to be irrelevant. 


¹Since a random variable is a mapping from $\Omega$ to $\mathbb{R}$, the sample values of a rv are real and thus the sample 
functions of a random process are real. It is often important to define objects called complex random variables 
that map $\Omega$ to $\mathbb{C}$. One can then define a complex random process as a process that maps each $t \in \mathbb{R}$ into a 
complex random variable. These complex random processes will be important in studying noise waveforms at 
baseband. 

In order to specify a random process $\{Z(t);\ t \in \mathbb{R}\}$, some kind of rule is required from which joint 
distribution functions can, at least in principle, be calculated. That is, for all positive integers 
$n$, and all choices of $n$ epochs $t_1, t_2, \ldots, t_n$, it must be possible (in principle) to find the joint 
distribution function, 

$$F_{Z(t_1),\ldots,Z(t_n)}(z_1,\ldots,z_n) = \Pr\{Z(t_1) \le z_1, \ldots, Z(t_n) \le z_n\},$$ (7.1) 

for all choices of the real numbers $z_1, \ldots, z_n$. Equivalently, if densities exist, it must be possible 
(in principle) to find the joint density, 

$$f_{Z(t_1),\ldots,Z(t_n)}(z_1,\ldots,z_n) = \frac{\partial^n F_{Z(t_1),\ldots,Z(t_n)}(z_1,\ldots,z_n)}{\partial z_1 \cdots \partial z_n},$$ (7.2) 

for all real $z_1, \ldots, z_n$. Since $n$ can be arbitrarily large in (7.1) and (7.2), it might seem difficult 
for a simple rule to specify all these quantities, but a number of simple rules are given in the 
following examples that specify all these quantities. 


7.2.1. Examples of random processes 


The following generic example will turn out to be both useful and quite general. We saw earlier 
that we could specify waveforms by the sequence of coefficients in an orthonormal expansion. 
In the following example, a random process is similarly specified by a sequence of rv’s used as 
coefficients in an orthonormal expansion. 


Example 7.2.1. Let $Z_1, Z_2, \ldots$ be a sequence of rv’s defined on some sample space $\Omega$ and 
let $\phi_1(t), \phi_2(t), \ldots$ be a sequence of orthogonal (or orthonormal) real functions. For each 
$t \in \mathbb{R}$, let the rv $Z(t)$ be defined as $Z(t) = \sum_k Z_k \phi_k(t)$. The corresponding random process 
is then $\{Z(t);\ t \in \mathbb{R}\}$. For each $t$, $Z(t)$ is simply a sum of rv’s, so we could, in principle, find 
its distribution function. Similarly, for each $n$-tuple $t_1, \ldots, t_n$ of epochs, $Z(t_1), \ldots, Z(t_n)$ is an 
$n$-tuple of rv’s whose joint distribution could in principle be found. Since $Z(t)$ is a countably 
infinite sum of rv’s, $\sum_{k=1}^{\infty} Z_k \phi_k(t)$, there might be some mathematical intricacies in finding, or 
even defining, its distribution function. Fortunately, as will be seen, such intricacies do not arise 
in the processes of most interest here. 


It is clear that random processes can be defined as in the above example, but it is less clear 
that this will provide a mechanism for constructing reasonable models of actual physical noise 
processes. For the case of Gaussian processes, which will be defined shortly, this class of models 
will be shown to be broad enough to provide a flexible set of noise models. 


The next few examples specialize the above example in various ways. 


Example 7.2.2. Consider binary PAM, but view the input signals as independent identically 
distributed (iid) rv’s $U_1, U_2, \ldots$, which take on the values $\pm 1$ with probability 1/2 each. Assume 
that the modulation pulse is $\mathrm{sinc}(t/T)$ so the baseband random process is 

$$U(t) = \sum_k U_k\, \mathrm{sinc}\!\left(\frac{t - kT}{T}\right).$$ 

At each sampling epoch $kT$, the rv $U(kT)$ is simply the binary rv $U_k$. At epochs between the 
sampling epochs, however, $U(t)$ is a countably infinite sum of binary rv’s whose variance will 
later be shown to be 1, but whose distribution function is quite ugly and not of great interest. 
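

The binary PAM process of Example 7.2.2 can be explored numerically with the following minimal 
sketch. It is not part of the original development and rests on assumptions: the infinite sum is 
truncated to the hypothetical range $-K \le k \le K$, and NumPy's `np.sinc`, which computes 
$\sin(\pi x)/(\pi x)$, is taken to be the sinc function used here. 

```python
import numpy as np

rng = np.random.default_rng(1)

T = 1.0
K = 50                                     # truncation k = -K, ..., K (assumption)
k = np.arange(-K, K + 1)
U = rng.choice([-1.0, 1.0], size=k.size)   # iid +/-1, probability 1/2 each

def pam_waveform(t):
    """Truncated U(t) = sum_k U_k sinc((t - kT)/T), evaluated at the epochs in t."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # One row per epoch, one column per pulse; np.sinc(x) = sin(pi x)/(pi x).
    pulses = np.sinc((t[:, None] - k[None, :] * T) / T)
    return pulses @ U

at_sampling_epochs = pam_waveform(k * T)               # recovers the binary values U_k
between_epochs = pam_waveform(np.array([0.5 * T]))     # a sum over all the U_k
```

At the sampling epochs every pulse but one vanishes, so the waveform reproduces the data 
sequence; halfway between epochs every $U_k$ contributes, which is why the distribution there is 
so much less tidy. 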

Example 7.2.3. A random variable is said to be zero-mean Gaussian if it has the probability 
density 

$$f_Z(z) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[\frac{-z^2}{2\sigma^2}\right],$$ (7.3) 

where $\sigma^2$ is the variance of $Z$. A common model for a noise process $\{Z(t);\ t \in \mathbb{R}\}$ arises by 
letting 

$$Z(t) = \sum_k Z_k\, \mathrm{sinc}\!\left(\frac{t - kT}{T}\right),$$ (7.4) 


where $\ldots, Z_{-1}, Z_0, Z_1, \ldots$ is a sequence of iid zero-mean Gaussian rv’s of variance $\sigma^2$. At 
each sampling epoch $kT$, the rv $Z(kT)$ is the zero-mean Gaussian rv $Z_k$. At epochs between 
the sampling epochs, $Z(t)$ is a countably infinite sum of independent zero-mean Gaussian rv’s, 
which turns out to be itself zero-mean Gaussian of variance $\sigma^2$. The next section considers sums 
of Gaussian rv’s and their inter-relations in detail. The sample functions of this random process 
are simply sinc expansions and are limited to the baseband $[-1/(2T), 1/(2T)]$. This example, as 
well as the previous example, brings out the following mathematical issue: the expected energy 
in $\{Z(t);\ t \in \mathbb{R}\}$ turns out to be infinite. As discussed later, this energy can be made finite either 
by truncating $Z(t)$ to some finite interval much larger than any time of interest or by similarly 
truncating the sequence $\{Z_k;\ k \in \mathbb{Z}\}$. 


Another slightly disturbing aspect of this example is that this process cannot be ‘generated’ 
by a sequence of Gaussian rv’s entering a generating device that multiplies them by T-spaced 
sinc functions and adds them. The problem is the same as the problem with sinc functions in 
the previous chapter - they extend forever and thus the process cannot be generated with finite 
delay. This is not of concern here, since we are not trying to generate random processes, only to 
show that interesting processes can be defined. The approach here will be to define and analyze 
a wide variety of random processes, and then to see which are useful in modeling physical noise 
processes. 


Example 7.2.4. Let $\{Z(t);\ t \in [-1, 1]\}$ be defined by $Z(t) = tZ$ for all $t \in [-1, 1]$ where $Z$ 
is a zero-mean Gaussian rv of variance 1. This example shows that random processes can be 
very degenerate; a sample function of this process is fully specified by the sample value $z(t)$ at 
$t = 1$. The sample functions are simply straight lines through the origin with random slope. 
This illustrates that the sample functions of a random process do not necessarily “look” random. 


7.2.2. The mean and covariance of a random process 


Often the first thing of interest about a random process is the mean at each epoch $t$ and 
the covariance between any two epochs $t$, $\tau$. The mean, $E[Z(t)] = \bar{Z}(t)$, is simply a real-valued 
function of $t$ and can be found directly from the distribution function $F_{Z(t)}(z)$ or density $f_{Z(t)}(z)$. 
It can be verified that $\bar{Z}(t)$ is 0 for all $t$ for Examples 7.2.2, 7.2.3, and 7.2.4 above. For Example 
7.2.1, the mean cannot be specified without specifying more about the random sequence and 
the orthogonal functions. 


The covariance² is a real-valued function of the epochs $t$ and $\tau$. It is denoted by $K_Z(t,\tau)$ and 
defined by 

$$K_Z(t,\tau) = E\big[[Z(t) - \bar{Z}(t)][Z(\tau) - \bar{Z}(\tau)]\big].$$ (7.5) 

This can be calculated (in principle) from the joint distribution function $F_{Z(t),Z(\tau)}(z_1, z_2)$ or from 
the density $f_{Z(t),Z(\tau)}(z_1, z_2)$. To make the covariance function look a little simpler, we usually 
split each random variable $Z(t)$ into its mean, $\bar{Z}(t)$, and its fluctuation, $\tilde{Z}(t) = Z(t) - \bar{Z}(t)$. The 
covariance function is then 

$$K_Z(t,\tau) = E\big[\tilde{Z}(t)\tilde{Z}(\tau)\big].$$ (7.6) 


²This is often called the autocovariance to distinguish it from the covariance between two processes; we will 
not need to refer to this latter type of covariance. 


The random processes of most interest to us are used to model noise waveforms and usually 
have zero mean, in which case $\tilde{Z}(t) = Z(t)$. In other cases, it often aids intuition to separate 
the process into its mean (which is simply an ordinary function) and its fluctuation, which is by 
definition zero mean. 
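

The split into mean and fluctuation can be tried out numerically. The sketch below is a Monte 
Carlo illustration under assumptions, not a method from the text: it takes the degenerate process 
of Example 7.2.4, adds a hypothetical deterministic mean $1 + t$, and estimates the mean and the 
covariance from many independent trials. For this process the fluctuation is $tZ$, so the 
covariance should be close to $t\tau$. 

```python
import numpy as np

rng = np.random.default_rng(6)
trials = 200_000
Z = rng.standard_normal(trials)          # one sample value of Z per trial

def mean_function(t):
    """Hypothetical deterministic mean, chosen only for illustration."""
    return 1.0 + t

def process(t):
    """Sample values of the rv at epoch t: mean plus the fluctuation t*Z."""
    return mean_function(t) + t * Z

t, tau = 0.5, -0.8

# Estimate the mean at epoch t, then remove the estimated means to get fluctuations.
empirical_mean = np.mean(process(t))
fluct_t = process(t) - np.mean(process(t))
fluct_tau = process(tau) - np.mean(process(tau))

# Estimate of the covariance at (t, tau) as the average product of fluctuations.
empirical_cov = np.mean(fluct_t * fluct_tau)
```

With this many trials the estimates should agree with $1 + t$ and $t\tau$ to within a few 
thousandths, although the exact values depend on the random seed. 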


The covariance function for the generic random process in Example 7.2.1 above can be written 
as 

$$K_Z(t,\tau) = E\Big[\sum_k Z_k \phi_k(t) \sum_m Z_m \phi_m(\tau)\Big].$$ (7.7) 

If we assume that the rv’s $Z_1, Z_2, \ldots$ are zero-mean and iid with variance $\sigma^2$, then $E[Z_k Z_m] = 0$ for $k \ne m$ and 
$E[Z_k Z_m] = \sigma^2$ for $k = m$. Thus, ignoring convergence questions, (7.7) simplifies to 

$$K_Z(t,\tau) = \sigma^2 \sum_k \phi_k(t)\phi_k(\tau).$$ (7.8) 

For the sampling expansion, where $\phi_k(t) = \mathrm{sinc}(\frac{t}{T} - k)$, it can be shown (see (7.48)) that the 
sum in (7.8) is simply $\mathrm{sinc}(\frac{t-\tau}{T})$. Thus for Examples 7.2.2 and 7.2.3, the covariance is given by 

$$K_Z(t,\tau) = \sigma^2\, \mathrm{sinc}\!\left(\frac{t - \tau}{T}\right),$$ 

where $\sigma^2 = 1$ for the binary PAM case of Example 7.2.2. Note that this covariance depends 
only on $t - \tau$ and not on the relationship between $t$ or $\tau$ and the sampling points $kT$. These 
sampling processes are considered in more detail later. 
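

The claim that the sum in (7.8) collapses to a single sinc for the sampling expansion can be 
checked numerically. The following sketch is only a sanity check under assumptions: the infinite 
sum is truncated to the hypothetical range $-K \le k \le K$, the epochs are arbitrary choices, and 
`np.sinc(x)` is taken as $\sin(\pi x)/(\pi x)$. Setting $t = \tau$ also gives the variance between 
sampling epochs (for $\sigma^2 = 1$). 

```python
import numpy as np

def truncated_sum(t, tau, T=1.0, K=5000):
    """Truncated version of sum_k sinc(t/T - k) sinc(tau/T - k)."""
    k = np.arange(-K, K + 1)
    return np.sum(np.sinc(t / T - k) * np.sinc(tau / T - k))

t, tau, T = 0.3, 1.75, 1.0               # arbitrary epochs chosen for illustration

lhs = truncated_sum(t, tau, T)           # the sum appearing in (7.8), truncated
rhs = np.sinc((t - tau) / T)             # the closed form sinc((t - tau)/T)

# With t = tau the sum is the variance of the process at t when sigma^2 = 1.
variance_between_epochs = truncated_sum(0.5, 0.5)
```

The terms of the sum decay like $1/k^2$, so the truncation error shrinks roughly like $1/K$; the 
two sides should agree to several decimal places for the value of $K$ used here. 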


7.2.3 Additive noise channels 


The communication channels of greatest interest to us are known as additive noise channels. 
Both the channel input and the noise are modeled as random processes, $\{X(t);\ t \in \mathbb{R}\}$ and 
$\{Z(t);\ t \in \mathbb{R}\}$, both on the same underlying sample space $\Omega$. The channel output is another 
random process $\{Y(t);\ t \in \mathbb{R}\}$ and $Y(t) = X(t) + Z(t)$. This means that for each epoch $t$ the 
random variable $Y(t)$ is equal to $X(t) + Z(t)$. 


Note that one could always define the noise on a channel as the difference $Y(t) - X(t)$ between 
output and input. The notion of additive noise inherently also includes the assumption that the 
processes $\{X(t);\ t \in \mathbb{R}\}$ and $\{Z(t);\ t \in \mathbb{R}\}$ are statistically independent.³ 


³More specifically, this means that for all $k > 0$, all epochs $t_1, \ldots, t_k$, and all epochs $\tau_1, \ldots, \tau_k$, the rv’s 
$X(t_1), \ldots, X(t_k)$ are statistically independent of $Z(\tau_1), \ldots, Z(\tau_k)$. 

As discussed earlier, the additive noise model Y(t) = X(t) + Z(t) implicitly assumes that the 
channel attenuation, propagation delay, and carrier frequency and phase are perfectly known and 
compensated for. It also assumes that the input waveform is not changed by any disturbances 
other than the noise, Z(t). 


Additive noise is most frequently modeled as a Gaussian process, as discussed in the next section. 
Even when the noise is not modeled as Gaussian, it is often modeled as some modification of 
a Gaussian process. Many rules of thumb in engineering and statistics about noise are stated 
without any mention of Gaussian processes, but are often valid only for Gaussian processes. 


7.3 Gaussian random variables, vectors, and processes 


This section first defines Gaussian random variables (rv’s), then jointly-Gaussian random vec- 
tors (rv’s), and finally Gaussian random processes. The covariance function and joint density 
function for Gaussian random vectors are then derived. Finally several equivalent conditions for 
rv’s to be jointly Gaussian are derived. 


A rv $W$ is a normalized Gaussian rv, or more briefly a normal⁴ rv, if it has the probability 
density 

$$f_W(w) = \frac{1}{\sqrt{2\pi}} \exp\left(\frac{-w^2}{2}\right).$$ 

This density is symmetric around 0 and thus the mean of $W$ is zero. The variance is 1, which is 
probably familiar from elementary probability and is demonstrated in Exercise 7.1. A random 
variable $Z$ is a Gaussian rv if it is a scaled and shifted version of a normal rv, i.e., if $Z = \sigma W + \bar{Z}$ 
for a normal rv $W$. It can be seen that $\bar{Z}$ is the mean of $Z$ and $\sigma^2$ is the variance⁵. The density 
of $Z$ (for $\sigma^2 > 0$) is 

$$f_Z(z) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(\frac{-(z - \bar{Z})^2}{2\sigma^2}\right).$$ (7.9) 

A Gaussian rv $Z$ of mean $\bar{Z}$ and variance $\sigma^2$ is denoted as $Z \sim \mathcal{N}(\bar{Z}, \sigma^2)$. The Gaussian rv’s 
used to represent noise are almost invariably zero-mean. Such rv’s have the density $f_Z(z) = 
\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[\frac{-z^2}{2\sigma^2}\right]$ and are denoted by $Z \sim \mathcal{N}(0, \sigma^2)$. 


Zero-mean Gaussian rv’s are important in modeling noise and other random phenomena for the 
following reasons: 


• They serve as good approximations to the sum of many independent zero-mean rv’s (recall 
the central limit theorem). 


• They have a number of extremal properties; as discussed later, they are, in several senses, 
the most random rv’s for a given variance. 


• They are easy to manipulate analytically, given a few simple properties. 


• They serve as common channel noise models, and in fact the literature often assumes that 
noise is modeled by zero-mean Gaussian rv’s without explicitly stating it. 
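

The first reason in the list above, the central limit theorem, is easy to observe numerically. The 
sketch below is an illustration under assumptions rather than a proof: the summands are taken 
(arbitrarily) to be uniform on $[-1/2, 1/2]$, the number of summands and trials are hypothetical 
choices, and the comparison uses the well-known fractions of a normal rv lying within one and two 
standard deviations of the mean. 

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 50, 50_000                   # summands per sum, number of sums (assumptions)

# Zero-mean uniform rv's on [-1/2, 1/2]; each has variance 1/12.
X = rng.uniform(-0.5, 0.5, size=(trials, n))

# Sum of n iid rv's, scaled so that the sum has unit variance.
S = X.sum(axis=1) / np.sqrt(n / 12.0)

# For a normal rv these fractions are about 0.683 and 0.954 respectively.
frac_within_one_sigma = np.mean(np.abs(S) <= 1.0)
frac_within_two_sigma = np.mean(np.abs(S) <= 2.0)
empirical_variance = np.var(S)
```

Even though no individual summand looks remotely Gaussian, the normalized sum matches the 
normal fractions closely; replacing the uniform distribution by another zero-mean distribution of 
finite variance should give similar results. 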


⁴Some people use normal rv as a synonym for Gaussian rv. 
⁵It is convenient to denote $Z$ as Gaussian even in the deterministic case where $\sigma = 0$, but (7.9) is invalid then. 




Definition 7.3.1. A set of $n$ random variables, $Z_1, \ldots, Z_n$, is zero-mean jointly Gaussian if 
there is a set of iid normal rv’s $W_1, \ldots, W_\ell$ such that each $Z_k$, $1 \le k \le n$, can be expressed as 

$$Z_k = \sum_{m=1}^{\ell} a_{km} W_m; \qquad 1 \le k \le n,$$ (7.10) 

where $\{a_{km};\ 1 \le k \le n,\ 1 \le m \le \ell\}$ is an array of real numbers. $Z_1', \ldots, Z_n'$ is jointly Gaussian if 
$Z_k' = Z_k + \bar{Z}_k'$, where the set $Z_1, \ldots, Z_n$ is zero-mean jointly Gaussian and $\bar{Z}_1', \ldots, \bar{Z}_n'$ is a set of 
real numbers. 


It is convenient notationally to refer to a set of $n$ random variables, $Z_1, \ldots, Z_n$, as a random 
vector⁶ (rv) $\mathbf{Z} = (Z_1, \ldots, Z_n)^{\mathsf{T}}$. Letting $A$ be the $n$ by $\ell$ real matrix with elements 
$\{a_{km};\ 1 \le k \le n,\ 1 \le m \le \ell\}$ and letting $\mathbf{W} = (W_1, \ldots, W_\ell)^{\mathsf{T}}$, (7.10) can then be represented 
more compactly as 

$$\mathbf{Z} = A\mathbf{W}.$$ (7.11) 

Similarly the jointly-Gaussian random vector $\mathbf{Z}'$ above can be represented as $\mathbf{Z}' = A\mathbf{W} + \bar{\mathbf{Z}}'$ 
where $\bar{\mathbf{Z}}'$ is an $n$-vector of real numbers. 


In the remainder of this chapter, all random variables, random vectors, and random processes 
are assumed to be zero-mean unless explicitly designated otherwise. Viewed differently, only the 
fluctuations are analyzed with the means added at the end⁷. 


It is shown in Exercise 7.2 that any sum $\sum_m a_{km} W_m$ of iid normal rv’s $W_1, \ldots, W_\ell$ is a Gaussian 
rv, so that each $Z_k$ in (7.10) is Gaussian. Jointly Gaussian means much more than this, however. 
The random variables $Z_1, \ldots, Z_n$ must also be related as linear combinations of the same set of 
iid normal variables. Exercises 7.3 and 7.4 illustrate some examples of pairs of random variables 
which are individually Gaussian but not jointly Gaussian. These examples are slightly artificial, 
but illustrate clearly that the joint density of jointly-Gaussian rv’s is much more constrained 
than the possible joint densities arising from constraining marginal distributions to be Gaussian. 
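

To make the distinction between individually Gaussian and jointly Gaussian concrete, the sketch 
below simulates one standard construction. It is an assumption that this construction resembles 
the ones in the exercises; it is offered only as an illustration. Let $Z_1$ be normal and let 
$Z_2 = S Z_1$, where $S = \pm 1$ with probability 1/2 each, independently of $Z_1$. By symmetry $Z_2$ is 
also normal, but $Z_1 + Z_2$ equals 0 with probability 1/2 and so cannot be Gaussian, whereas every 
linear combination of jointly-Gaussian rv’s is Gaussian. 

```python
import numpy as np

rng = np.random.default_rng(4)
trials = 100_000

Z1 = rng.standard_normal(trials)                 # normal rv
sign = rng.choice([-1.0, 1.0], size=trials)      # S = +/-1, independent of Z1
Z2 = sign * Z1                                   # also normal, by symmetry

total = Z1 + Z2                                  # equals 2*Z1 or exactly 0

# A Gaussian rv with positive variance takes any single value with probability 0,
# so a large fraction of exact zeros shows that the sum is not Gaussian.
frac_zero = np.mean(total == 0.0)
variance_Z2 = np.var(Z2)
```

The estimated fraction of zeros should be close to 1/2, while the marginal statistics of $Z_2$ look 
exactly like those of a normal rv. 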


The above definition of jointly Gaussian looks a little contrived at first, but is in fact very natural. 
Gaussian rv’s often make excellent models for physical noise processes because noise is often the 
summation of many small effects. The central limit theorem is a mathematically precise way of 
saying that the sum of a very large number of independent small zero-mean random variables 
is approximately zero-mean Gaussian. Even when different sums are statistically dependent on 
each other, they are different linear combinations of a common set of independent small random 
variables. Thus the jointly-Gaussian assumption is closely linked to the assumption that the 
noise is the sum of a large number of small, essentially independent, random disturbances. 
Assuming that the underlying variables are Gaussian simply makes the model analytically clean 
and tractable. 


An important property of any jointly-Gaussian $n$-dimensional rv $\mathbf{Z}$ is the following: for any 
real $m$ by $n$ matrix $B$, the rv $\mathbf{Y} = B\mathbf{Z}$ is also jointly Gaussian. To see this, let $\mathbf{Z} = A\mathbf{W}$ where 
$\mathbf{W}$ is a normal rv. Then 

$$\mathbf{Y} = B\mathbf{Z} = B(A\mathbf{W}) = (BA)\mathbf{W}.$$ (7.12) 


⁶The class of random vectors for a given $n$ over a given sample space satisfies the axioms of a vector space, 
but here the vector notation is used simply as a notational convenience. 

⁷When studying estimation and conditional probabilities, means become an integral part of many arguments, 
but these arguments will not be central here. 

Since $BA$ is a real matrix, $\mathbf{Y}$ is jointly Gaussian. A useful application of this property arises 
when $A$ is diagonal, so $\mathbf{Z}$ has arbitrary independent Gaussian components. This implies that 
$\mathbf{Y} = B\mathbf{Z}$ is jointly Gaussian whenever a rv $\mathbf{Z}$ has independent Gaussian components. 


Another important application is where $B$ is a 1 by $n$ matrix and $Y$ is a random variable. Thus 
every linear combination $\sum_{k=1}^{n} b_k Z_k$ of a jointly-Gaussian rv $\mathbf{Z} = (Z_1, \ldots, Z_n)^{\mathsf{T}}$ is Gaussian. It 
will be shown later in this section that this is an if and only if property; that is, if every linear 
combination of a rv $\mathbf{Z}$ is Gaussian, then $\mathbf{Z}$ is jointly Gaussian. 


We now have the machinery to define zero-mean Gaussian processes. 


Definition 7.3.2. $\{Z(t);\ t \in \mathbb{R}\}$ is a zero-mean Gaussian process if, for all positive integers $n$ 
and all finite sets of epochs $t_1, \ldots, t_n$, the set of random variables $Z(t_1), \ldots, Z(t_n)$ is a 
(zero-mean) jointly-Gaussian set of random variables. 


If the covariance, $K_Z(t,\tau) = E[Z(t)Z(\tau)]$, is known for each pair of epochs $t$, $\tau$, then for any 
finite set of epochs $t_1, \ldots, t_n$, $E[Z(t_k)Z(t_m)]$ is known for each pair $(t_k, t_m)$ in that set. The 
next two subsections will show that the joint probability density for any such set of (zero-mean) 
jointly-Gaussian rv’s depends only on the covariances of those variables. This will show that a 
zero-mean Gaussian process is specified by its covariance function. A nonzero-mean Gaussian 
process is similarly specified by its covariance function and its mean. 


7.3.1 The covariance matrix of a jointly-Gaussian random vector 


Let an $n$-tuple of (zero-mean) random variables (rv’s) $Z_1, \ldots, Z_n$ be represented as a random 
vector (rv) $\mathbf{Z} = (Z_1, \ldots, Z_n)^{\mathsf{T}}$. As defined in the previous section, $\mathbf{Z}$ is jointly Gaussian if 
$\mathbf{Z} = A\mathbf{W}$ where $\mathbf{W} = (W_1, W_2, \ldots, W_\ell)^{\mathsf{T}}$ is a vector of iid normal rv’s and $A$ is an $n$ by $\ell$ real 
matrix. Each rv $Z_k$, and all linear combinations of $Z_1, \ldots, Z_n$, are Gaussian. 


The covariance of two (zero-mean) rv’s $Z_1$, $Z_2$ is $E[Z_1 Z_2]$. For a rv $\mathbf{Z} = (Z_1, \ldots, Z_n)^{\mathsf{T}}$ the 
covariance between all pairs of random variables is very conveniently represented by the $n$ by $n$ 
covariance matrix, 

$$K_{\mathbf{Z}} = E[\mathbf{Z}\mathbf{Z}^{\mathsf{T}}].$$ 

Appendix 7A.1 develops a number of properties of covariance matrices (including the fact that 
they are identical to the class of nonnegative definite matrices). For a vector $\mathbf{W} = (W_1, \ldots, W_\ell)^{\mathsf{T}}$ 
of independent normalized Gaussian rv’s, $E[W_j W_m] = 0$ for $j \ne m$ and 1 for $j = m$. Thus 

$$K_{\mathbf{W}} = E[\mathbf{W}\mathbf{W}^{\mathsf{T}}] = I_\ell,$$ 

where $I_\ell$ is the $\ell$ by $\ell$ identity matrix. For a zero-mean jointly-Gaussian vector $\mathbf{Z} = A\mathbf{W}$, the 
covariance matrix is thus 

$$K_{\mathbf{Z}} = E[A\mathbf{W}\mathbf{W}^{\mathsf{T}}A^{\mathsf{T}}] = A\,E[\mathbf{W}\mathbf{W}^{\mathsf{T}}]\,A^{\mathsf{T}} = AA^{\mathsf{T}}.$$ (7.13) 
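
The relation in (7.13) between the matrix $A$ and the covariance matrix can be checked by simulation. 
The sketch below is a minimal Monte Carlo illustration under assumptions: the 2 by 3 matrix $A$ is 
hypothetical, chosen only for the example, and the expectation is replaced by an average over many 
independent trials. 

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical 2-by-3 real matrix A (n = 2, ell = 3), for illustration only.
A = np.array([[1.0,  0.0, 2.0],
              [0.5, -1.0, 0.0]])
trials = 200_000

# Each column of W is one sample value of a vector of iid normal rv's.
W = rng.standard_normal((A.shape[1], trials))
Z = A @ W                                   # sample values of Z = A W

K_theory = A @ A.T                          # covariance matrix A A^T
K_empirical = (Z @ Z.T) / trials            # average of z z^T over the trials

# A covariance matrix is symmetric and nonnegative definite.
eigenvalues = np.linalg.eigvalsh(K_theory)
```

The empirical matrix should agree with $AA^{\mathsf{T}}$ entry by entry up to sampling error, and the 
eigenvalues of $AA^{\mathsf{T}}$ are nonnegative, consistent with the property cited from Appendix 7A.1. 
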
The probability density, $f_{\mathbf{Z}}(\mathbf{z})$, of a rv $\mathbf{Z} = (Z_1, Z_2, \ldots, Z_n)^{\mathsf{T}}$ is the joint probability density 
of the components $Z_1, \ldots, Z_n$. An important example is the iid rv $\mathbf{W}$ where the components 
$W_k$, $1 \le k \le n$, are iid and normal, $W_k \sim \mathcal{N}(0, 1)$. By taking the product of the $n$ densities of 
the individual random variables, the density of $\mathbf{W} = (W_1, W_2, \ldots, W_n)^{\mathsf{T}}$ is 

$$f_{\mathbf{W}}(\mathbf{w}) = \frac{1}{(2\pi)^{n/2}} \exp\left(\frac{-w_1^2 - w_2^2 - \cdots - w_n^2}{2}\right) = \frac{1}{(2\pi)^{n/2}} \exp\left(\frac{-\|\mathbf{w}\|^2}{2}\right).$$ (7.14) 

This shows that the density of $\mathbf{W}$ at a sample value $\mathbf{w}$ depends only on the squared distance 
$\|\mathbf{w}\|^2$ of the sample value from the origin. That is, $f_{\mathbf{W}}(\mathbf{w})$ is spherically symmetric around the 
origin, and points of equal probability density lie on concentric spheres around the origin. 


7.3.2 The probability density of a jointly-Gaussian random vector 


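Consider the transformation $\mathbf{Z} = A\mathbf{W}$ where $\mathbf{Z}$ and $\mathbf{W}$ each have n components and $A$ is n by n.
If we let $\mathbf{a}_1, \mathbf{a}_2,\ldots,\mathbf{a}_n$ be the n columns of $A$, then this means that $\mathbf{Z} = \sum_m \mathbf{a}_m W_m$. That is,
for any sample values $w_1,\ldots,w_n$ for $\mathbf{W}$, the corresponding sample value for $\mathbf{Z}$ is $\mathbf{z} = \sum_m \mathbf{a}_m w_m$.
Similarly, if we let $\mathbf{b}_1,\ldots,\mathbf{b}_n$ be the rows of $A$, then $Z_k = \mathbf{b}_k \mathbf{W}$.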


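Let $B_\delta$ be a cube, $\delta$ on a side, of the sample values of $\mathbf{W}$ defined by $B_\delta = \{\mathbf{w} : 0 \le w_k \le \delta;\ 1 \le k \le n\}$
(see Figure 7.1). The set $B'_\delta$ of vectors $\mathbf{z} = A\mathbf{w}$ for $\mathbf{w} \in B_\delta$ is a parallelepiped whose sides are the
vectors $\delta\mathbf{a}_1,\ldots,\delta\mathbf{a}_n$. The determinant, $\det(A)$, of $A$ has the remarkable geometric property that
its magnitude, $|\det(A)|$, is equal to the volume of the parallelepiped with sides $\mathbf{a}_k$; $1 \le k \le n$.
Thus the cube $B_\delta$ above, with volume $\delta^n$, is mapped by $A$ into a parallelepiped of volume
$|\det A|\,\delta^n$.

[Figure 7.1 (image not reproduced): a square of side $\delta$ in the $(w_1, w_2)$ plane and its image in the $(z_1, z_2)$ plane, a parallelogram with sides $\delta\mathbf{a}_1$ and $\delta\mathbf{a}_2$.]

Figure 7.1: Example illustrating how $\mathbf{Z} = A\mathbf{W}$ maps cubes into parallelepipeds. Let
$Z_1 = -W_1 + 2W_2$ and $Z_2 = W_1 + W_2$. The figure shows the set of sample pairs $z_1, z_2$
corresponding to $0 \le w_1 \le \delta$ and $0 \le w_2 \le \delta$. It also shows a translation of the same
cube mapping into a translation of the same parallelepiped.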
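The volume property of the determinant can be checked numerically for the example in Figure 7.1. The sketch below maps the corners of a square of side $\delta$ through $A$ and measures the area of the image with the shoelace formula. The value of `delta` and the helper names are arbitrary choices for illustration.

```python
def det2(a):
    """Determinant of a 2 by 2 matrix stored as a list of rows."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def apply(a, w):
    """Return the point z = A w for a 2 by 2 matrix A."""
    return (a[0][0] * w[0] + a[0][1] * w[1],
            a[1][0] * w[0] + a[1][1] * w[1])


def polygon_area(points):
    """Shoelace formula; points must be listed in order around the polygon."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Matrix of the Figure 7.1 example: Z1 = -W1 + 2*W2, Z2 = W1 + W2.
A = [[-1.0, 2.0], [1.0, 1.0]]
delta = 0.5  # arbitrary side length

square = [(0.0, 0.0), (delta, 0.0), (delta, delta), (0.0, delta)]
image = [apply(A, w) for w in square]
area = polygon_area(image)
print(area, abs(det2(A)) * delta ** 2)
```

The two printed numbers agree: the image area equals $|\det A|\,\delta^2$, the 2-dimensional case of the statement above.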


Assume that the columns $\mathbf{a}_1,\ldots,\mathbf{a}_n$ are linearly independent. This means that the columns
must form a basis for $\mathbb{R}^n$, and thus that every vector $\mathbf{z}$ is some linear combination of these
columns, i.e., that $\mathbf{z} = A\mathbf{w}$ for some vector $\mathbf{w}$. The matrix $A$ must then be invertible, i.e., there
is a matrix $A^{-1}$ such that $AA^{-1} = A^{-1}A = I_n$, where $I_n$ is the n by n identity matrix. The matrix
$A$ maps the unit vectors of $\mathbb{R}^n$ into the vectors $\mathbf{a}_1,\ldots,\mathbf{a}_n$, and the matrix $A^{-1}$ maps $\mathbf{a}_1,\ldots,\mathbf{a}_n$
back into the unit vectors.


If the columns of $A$ are not linearly independent, i.e., $A$ is not invertible, then $A$ maps the unit
cube in $\mathbb{R}^n$ into a subspace of dimension less than n. In terms of Fig. 7.1, the unit cube would
be mapped into a straight line segment. The area, in 2 dimensional space, of a straight line
segment is 0, and more generally, the volume in n-space of a lower dimensional set of points is
0. In terms of the determinant, $\det A = 0$ for any noninvertible matrix $A$.


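Assuming again that $A$ is invertible, let $\mathbf{z}$ be a sample value of $\mathbf{Z}$, and let $\mathbf{w} = A^{-1}\mathbf{z}$ be the
corresponding sample value of $\mathbf{W}$. Consider the incremental cube $\mathbf{w} + B_\delta$ cornered at $\mathbf{w}$. For $\delta$
very small, the probability $P_\delta(\mathbf{w})$ that $\mathbf{W}$ lies in this cube is $f_{\mathbf{W}}(\mathbf{w})\delta^n$ plus terms that go to zero
faster than $\delta^n$ as $\delta \to 0$. This cube around $\mathbf{w}$ maps into a parallelepiped of volume $\delta^n|\det(A)|$
around $\mathbf{z}$, and no other sample value of $\mathbf{W}$ maps into this parallelepiped. Thus $P_\delta(\mathbf{w})$ is also
equal to $f_{\mathbf{Z}}(\mathbf{z})\delta^n|\det(A)|$ plus negligible terms. Going to the limit $\delta \to 0$, we have

$$f_{\mathbf{Z}}(\mathbf{z})\,|\det(A)| = \lim_{\delta\to 0} \frac{P_\delta(\mathbf{w})}{\delta^n} = f_{\mathbf{W}}(\mathbf{w}). \tag{7.15}$$

Since $\mathbf{w} = A^{-1}\mathbf{z}$, we get the explicit formula

$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{f_{\mathbf{W}}(A^{-1}\mathbf{z})}{|\det(A)|}. \tag{7.16}$$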


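This formula is valid for any random vector $\mathbf{W}$ with a density, but we are interested in the
vector $\mathbf{W}$ of iid Gaussian random variables, $\mathcal{N}(0,1)$. Substituting (7.14) into (7.16),

$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{1}{(2\pi)^{n/2}|\det(A)|} \exp\left(\frac{-\|A^{-1}\mathbf{z}\|^2}{2}\right) \tag{7.17}$$

$$\phantom{f_{\mathbf{Z}}(\mathbf{z})} = \frac{1}{(2\pi)^{n/2}|\det(A)|} \exp\left[-\frac{1}{2}\,\mathbf{z}^T (A^{-1})^T A^{-1}\mathbf{z}\right]. \tag{7.18}$$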


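We can simplify this somewhat by recalling from (7.13) that the covariance matrix of $\mathbf{Z}$ is given
by $K_{\mathbf{Z}} = AA^T$. Thus $K_{\mathbf{Z}}^{-1} = (A^{-1})^T A^{-1}$.
Substituting this into (7.18) and noting that $\det(K_{\mathbf{Z}}) = |\det(A)|^2$,

$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{1}{(2\pi)^{n/2}\sqrt{\det(K_{\mathbf{Z}})}} \exp\left(-\frac{1}{2}\,\mathbf{z}^T K_{\mathbf{Z}}^{-1}\mathbf{z}\right). \tag{7.19}$$

Note that this probability density depends only on the covariance matrix of $\mathbf{Z}$ and not directly
on the matrix $A$.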
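The equivalence of (7.16) and (7.19) can be checked numerically in two dimensions. The sketch below evaluates the density once through the change of variables and once through the covariance matrix alone. The matrix `A`, the test point `z`, and the function names are assumptions made for this illustration.

```python
import math


def det2(a):
    """Determinant of a 2 by 2 matrix stored as a list of rows."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def inv2(a):
    """Inverse of a nonsingular 2 by 2 matrix."""
    d = det2(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]


def mat_vec(a, v):
    """Return the product A v for a 2 by 2 matrix A."""
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]


def f_w(w):
    """Density of two iid N(0, 1) rv's, i.e. (7.14) with n = 2."""
    return math.exp(-(w[0] ** 2 + w[1] ** 2) / 2.0) / (2.0 * math.pi)


def f_z_change_of_variables(a, z):
    """(7.16): f_Z(z) = f_W(A^{-1} z) / |det A|."""
    return f_w(mat_vec(inv2(a), z)) / abs(det2(a))


def f_z_from_covariance(k, z):
    """(7.19) for n = 2, using only the covariance matrix K_Z."""
    kinv_z = mat_vec(inv2(k), z)
    quad = z[0] * kinv_z[0] + z[1] * kinv_z[1]
    return math.exp(-quad / 2.0) / (2.0 * math.pi * math.sqrt(det2(k)))


A = [[-1.0, 2.0], [1.0, 1.0]]   # hypothetical mixing matrix
K = [[5.0, 1.0], [1.0, 2.0]]    # K_Z = A A^T for the matrix above
z = [0.7, -0.3]                 # arbitrary sample value

print(f_z_change_of_variables(A, z), f_z_from_covariance(K, z))
```

Any other nonsingular matrix with the same product $AA^T$ would give the same density, which is the point of the remark that the density does not depend directly on $A$.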


The above density relies on $A$ being nonsingular. If $A$ is singular, then at least one of its rows
is a linear combination of the other rows, and thus, for some m, $1 \le m \le n$, $Z_m$ is a linear
combination of the other $Z_k$. The random vector $\mathbf{Z}$ is still jointly Gaussian, but the joint
probability density does not exist (unless one wishes to view the density of $Z_m$ as a unit impulse
at a point specified by the sample values of the other variables). It is possible to write out
the distribution function for this case, using step functions for the dependent rv's, but it is not
worth the notational mess. It is more straightforward to face the problem and find the density
of a maximal set of linearly independent rv's, and specify the others as deterministic linear
combinations.


It is important to understand that there is a large difference between rv's being statistically
dependent and linearly dependent. If they are linearly dependent, then one or more are
deterministic functions of the others, whereas statistical dependence simply implies a probabilistic
relationship.


These results are summarized in the following theorem: 


Theorem 7.3.1 (Density for jointly-Gaussian rv's). Let $\mathbf{Z}$ be a (zero-mean) jointly-Gaussian
rv with a nonsingular covariance matrix $K_{\mathbf{Z}}$. Then the probability density $f_{\mathbf{Z}}(\mathbf{z})$ is
given by (7.19). If $K_{\mathbf{Z}}$ is singular, then $f_{\mathbf{Z}}(\mathbf{z})$ does not exist but the density in (7.19) can be
applied to any set of linearly independent rv's out of $Z_1,\ldots,Z_n$.


For a zero-mean Gaussian process $Z(t)$, the covariance function $K_Z(t,\tau)$ specifies $E[Z(t_k)Z(t_m)]$
for arbitrary epochs $t_k$ and $t_m$ and thus specifies the covariance matrix for any finite set of epochs
$t_1,\ldots,t_n$. From the above theorem, this also specifies the joint probability distribution for that
set of epochs. Thus the covariance function specifies all joint probability distributions for all
finite sets of epochs, and thus specifies the process in the sense⁵ of Section 7.2. In summary, we
have the following important theorem.


Theorem 7.3.2 (Gaussian process). A zero-mean Gaussian process is specified by its
covariance function $K_Z(t,\tau)$.


7.3.3 Special case of a 2-dimensional zero-mean Gaussian random vector 


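The probability density in (7.19) is now written out in detail for the 2-dimensional case. Let
$E[Z_1^2] = \sigma_1^2$, $E[Z_2^2] = \sigma_2^2$ and $E[Z_1Z_2] = \kappa_{12}$. Thus

$$K_{\mathbf{Z}} = \begin{bmatrix} \sigma_1^2 & \kappa_{12} \\ \kappa_{12} & \sigma_2^2 \end{bmatrix}.$$

Let $\rho$ be the normalized covariance $\rho = \kappa_{12}/(\sigma_1\sigma_2)$. Then $\det(K_{\mathbf{Z}}) = \sigma_1^2\sigma_2^2 - \kappa_{12}^2 = \sigma_1^2\sigma_2^2(1-\rho^2)$.
Note that $\rho$ must satisfy $|\rho| \le 1$, and $|\rho| < 1$ for $K_{\mathbf{Z}}$ to be nonsingular.

$$K_{\mathbf{Z}}^{-1} = \frac{1}{\sigma_1^2\sigma_2^2 - \kappa_{12}^2} \begin{bmatrix} \sigma_2^2 & -\kappa_{12} \\ -\kappa_{12} & \sigma_1^2 \end{bmatrix} = \frac{1}{1-\rho^2} \begin{bmatrix} 1/\sigma_1^2 & -\rho/(\sigma_1\sigma_2) \\ -\rho/(\sigma_1\sigma_2) & 1/\sigma_2^2 \end{bmatrix}.$$

$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{1}{2\pi\sqrt{\sigma_1^2\sigma_2^2 - \kappa_{12}^2}} \exp\left(\frac{-z_1^2\sigma_2^2 + 2z_1z_2\kappa_{12} - z_2^2\sigma_1^2}{2(\sigma_1^2\sigma_2^2 - \kappa_{12}^2)}\right)$$

$$\phantom{f_{\mathbf{Z}}(\mathbf{z})} = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left(\frac{-(z_1/\sigma_1)^2 + 2\rho(z_1/\sigma_1)(z_2/\sigma_2) - (z_2/\sigma_2)^2}{2(1-\rho^2)}\right). \tag{7.20}$$

Curves of equal probability density in the plane correspond to points where the argument of
the exponential function in (7.20) is constant. This argument is quadratic and thus points of
equal probability density form an ellipse centered on the origin. The ellipses corresponding to
different values of probability density are concentric, with larger ellipses corresponding to smaller
densities.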


If the normalized covariance $\rho$ is 0, the axes of the ellipse are the horizontal and vertical axes of
the plane; if $\sigma_1 = \sigma_2$, the ellipse reduces to a circle, and otherwise the ellipse is elongated in the
direction of the larger standard deviation. If $\rho > 0$, the density in the first and third quadrants
is increased at the expense of the second and fourth, and thus the ellipses are elongated in the
first and third quadrants. This is reversed, of course, for $\rho < 0$.
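The 2-dimensional density (7.20) is short enough to code directly. The sketch below checks two of the statements above: for $\rho = 0$ the density factors into two one-dimensional Gaussian densities, and for $\rho > 0$ the density is larger in the first quadrant than at the mirror-image point in the second. The function names and the numerical values are illustrative assumptions.

```python
import math


def bivariate_density(z1, z2, sigma1, sigma2, rho):
    """Zero-mean 2-D Gaussian density as in (7.20); requires |rho| < 1."""
    if not abs(rho) < 1.0:
        raise ValueError("K_Z is nonsingular only for |rho| < 1")
    u, v = z1 / sigma1, z2 / sigma2
    exponent = (-u * u + 2.0 * rho * u * v - v * v) / (2.0 * (1.0 - rho * rho))
    scale = 2.0 * math.pi * sigma1 * sigma2 * math.sqrt(1.0 - rho * rho)
    return math.exp(exponent) / scale


def normal_density(x, sigma):
    """One-dimensional N(0, sigma^2) density."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


# rho = 0: the joint density is the product of the two marginal densities.
joint = bivariate_density(0.4, -1.1, 2.0, 0.5, 0.0)
product = normal_density(0.4, 2.0) * normal_density(-1.1, 0.5)

# rho > 0: compare a first-quadrant point with its second-quadrant mirror image.
first_quadrant = bivariate_density(1.0, 1.0, 1.0, 1.0, 0.6)
second_quadrant = bivariate_density(-1.0, 1.0, 1.0, 1.0, 0.6)
print(joint, product, first_quadrant, second_quadrant)
```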


The main point to be learned from this example, however, is that the detailed expression for 
2 dimensions in (7.20) is messy. The messiness gets far worse in higher dimensions. Vector 
notation is almost essential. One should reason directly from the vector equations and use 
standard computer programs for calculations. 


5 As will be discussed later, focusing on the pointwise behavior of a random process at all finite sets of epochs
has some of the same problems as specifying a function pointwise rather than in terms of $\mathcal{L}_2$ equivalence. This
can be ignored for the present.

7.3.4 Z = AW where A is orthogonal


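An n by n real matrix $A$ for which $AA^T = I_n$ is called an orthogonal matrix or orthonormal
matrix (orthonormal is more appropriate, but orthogonal is more common). For $\mathbf{Z} = A\mathbf{W}$,
where $\mathbf{W}$ is iid normal and $A$ is orthogonal, $K_{\mathbf{Z}} = AA^T = I_n$. Thus $K_{\mathbf{Z}}^{-1} = I_n$ also and (7.19)
becomes

$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{1}{(2\pi)^{n/2}} \exp\left[-\frac{1}{2}\,\mathbf{z}^T\mathbf{z}\right] = \prod_{k=1}^{n} \frac{\exp[-z_k^2/2]}{\sqrt{2\pi}}. \tag{7.21}$$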


This means that $A$ transforms $\mathbf{W}$ into a random vector $\mathbf{Z}$ with the same probability density,
and thus the components of $\mathbf{Z}$ are still normal and iid. To understand this better, note that
$AA^T = I_n$ means that $A^T$ is the inverse of $A$ and thus that $A^TA = I_n$. Letting $\mathbf{a}_m$ be the m-th
column of $A$, the equation $A^TA = I_n$ means that $\mathbf{a}_m^T\mathbf{a}_j = \delta_{mj}$ for each m, j, $1 \le m, j \le n$, i.e., that
the columns of $A$ are orthonormal. Thus, for the two dimensional example, the unit vectors
$\mathbf{e}_1, \mathbf{e}_2$ are mapped into orthonormal vectors $\mathbf{a}_1, \mathbf{a}_2$, so that the transformation simply rotates
the points in the plane. Although it is difficult to visualize such a transformation in higher
dimensional space, it is still called a rotation, and has the property that $\|A\mathbf{w}\|^2 = \mathbf{w}^TA^TA\mathbf{w}$,
which is just $\mathbf{w}^T\mathbf{w} = \|\mathbf{w}\|^2$. Thus, each point $\mathbf{w}$ maps into a point $A\mathbf{w}$ at the same distance
from the origin as itself.
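A plane rotation gives a concrete orthogonal matrix. The sketch below builds one for an arbitrary angle and checks both $AA^T = I_2$ and the preservation of distance from the origin. The angle, the test vector, and the helper names are assumptions chosen for illustration.

```python
import math


def rotation(theta):
    """2 by 2 rotation matrix for angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def gram(a):
    """Return A A^T for a 2 by 2 matrix A."""
    return [[sum(a[i][k] * a[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_vec(a, v):
    """Return the product A v for a 2 by 2 matrix A."""
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]


def norm_sq(v):
    """Squared Euclidean norm of a vector."""
    return sum(x * x for x in v)


A = rotation(0.9)       # arbitrary angle
w = [1.5, -2.0]         # arbitrary sample value of W
z = mat_vec(A, w)
print(gram(A))
print(norm_sq(w), norm_sq(z))
```

Since the density (7.14) depends on $\mathbf{w}$ only through $\|\mathbf{w}\|^2$, preserving the norm is exactly what leaves the density unchanged.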


Not only the columns of an orthogonal matrix are orthonormal, but the rows, say $\{\mathbf{b}_k;\ 1 \le k \le n\}$,
are also orthonormal (as is seen directly from $AA^T = I_n$). Since $Z_k = \mathbf{b}_k\mathbf{W}$, this means that, for
any set of orthonormal vectors $\mathbf{b}_1,\ldots,\mathbf{b}_n$, the random variables $Z_k = \mathbf{b}_k\mathbf{W}$ are normal and iid
for $1 \le k \le n$.


7.3.5 Probability density for Gaussian vectors in terms of principal axes 


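This subsection describes what is often a more convenient representation for the probability
density of an n-dimensional (zero-mean) Gaussian rv $\mathbf{Z}$ with a nonsingular covariance matrix
$K_{\mathbf{Z}}$. As shown in Appendix 7A.1, the matrix $K_{\mathbf{Z}}$ has n real orthonormal eigenvectors, $\mathbf{q}_1,\ldots,\mathbf{q}_n$,
with corresponding nonnegative (but not necessarily distinct⁶) real eigenvalues, $\lambda_1,\ldots,\lambda_n$. Also,
for any vector $\mathbf{z}$, it is shown that $\mathbf{z}^TK_{\mathbf{Z}}^{-1}\mathbf{z}$ can be expressed as $\sum_k \lambda_k^{-1}|\langle\mathbf{z},\mathbf{q}_k\rangle|^2$. Substituting
this in (7.19), we have

$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{1}{(2\pi)^{n/2}\sqrt{\det(K_{\mathbf{Z}})}} \exp\left(-\sum_k \frac{|\langle\mathbf{z},\mathbf{q}_k\rangle|^2}{2\lambda_k}\right). \tag{7.22}$$

Note that $\langle\mathbf{z},\mathbf{q}_k\rangle$ is the projection of $\mathbf{z}$ on the kth of n orthonormal directions. The determinant
of an n by n real matrix can be expressed in terms of the n eigenvalues (see Appendix 7A.1) as
$\det(K_{\mathbf{Z}}) = \prod_{k=1}^{n} \lambda_k$. Thus (7.22) becomes

$$f_{\mathbf{Z}}(\mathbf{z}) = \prod_{k=1}^{n} \frac{1}{\sqrt{2\pi\lambda_k}} \exp\left(\frac{-|\langle\mathbf{z},\mathbf{q}_k\rangle|^2}{2\lambda_k}\right). \tag{7.23}$$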


6 If an eigenvalue $\lambda$ has multiplicity m, it means that there is an m dimensional subspace of vectors $\mathbf{q}$ satisfying
$K_{\mathbf{Z}}\mathbf{q} = \lambda\mathbf{q}$; in this case any orthonormal set of m such vectors can be chosen as the m eigenvectors corresponding
to that eigenvalue.

The density in (7.23) is the product of n Gaussian densities. It can be interpreted as saying that the Gaussian
random variables $\{\langle\mathbf{Z},\mathbf{q}_k\rangle;\ 1 \le k \le n\}$ are statistically independent with variances $\{\lambda_k;\ 1 \le k \le n\}$.
In other words, if we represent the rv $\mathbf{Z}$ using $\mathbf{q}_1,\ldots,\mathbf{q}_n$ as a basis, then the components of
$\mathbf{Z}$ in that coordinate system are independent random variables. The orthonormal eigenvectors
are called principal axes for $\mathbf{Z}$.
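For a 2 by 2 covariance matrix the principal axes can be found in closed form, which makes the identity $\mathbf{z}^TK_{\mathbf{Z}}^{-1}\mathbf{z} = \sum_k \lambda_k^{-1}|\langle\mathbf{z},\mathbf{q}_k\rangle|^2$ easy to check. The sketch below assumes a symmetric matrix with nonzero off-diagonal entry; the example matrix `K`, the point `z`, and the function names are illustrative choices.

```python
import math


def eig_sym_2x2(k):
    """Eigenvalues and orthonormal eigenvectors of a symmetric 2 by 2 matrix.

    Assumes the off-diagonal entry is nonzero, so (b, lam - a) is a valid
    (nonzero) eigenvector for each eigenvalue lam.
    """
    a, b, d = k[0][0], k[0][1], k[1][1]
    mean = (a + d) / 2.0
    half_diff = math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (mean + half_diff, mean - half_diff):
        vx, vy = b, lam - a
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


def quad_form_direct(k, z):
    """z^T K^{-1} z computed from the explicit 2 by 2 inverse."""
    det = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    return (k[1][1] * z[0] ** 2 - 2.0 * k[0][1] * z[0] * z[1]
            + k[0][0] * z[1] ** 2) / det


def quad_form_axes(pairs, z):
    """Sum over k of <z, q_k>^2 / lambda_k, as used in (7.22)."""
    return sum((z[0] * q[0] + z[1] * q[1]) ** 2 / lam for lam, q in pairs)


K = [[5.0, 1.0], [1.0, 2.0]]   # hypothetical nonsingular covariance matrix
z = [0.7, -0.3]                # arbitrary sample value
PAIRS = eig_sym_2x2(K)
(lam1, q1), (lam2, q2) = PAIRS
print(lam1, lam2)
print(quad_form_direct(K, z), quad_form_axes(PAIRS, z))
```

For larger matrices one would normally rely on a numerical linear algebra library rather than closed-form expressions.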


This result can be viewed in terms of the contours of equal probability density for $\mathbf{Z}$ (see Figure
7.2). Each such contour satisfies

$$c = \sum_k \frac{|\langle\mathbf{z},\mathbf{q}_k\rangle|^2}{2\lambda_k},$$

where c is proportional to the log probability density for that contour. This is the equation of
an ellipsoid centered on the origin, where $\mathbf{q}_k$ is the kth axis of the ellipsoid and $\sqrt{2c\lambda_k}$ is the
length of that axis, measured from the origin to the surface of the ellipsoid.


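[Figure 7.2 (image not reproduced): an ellipse centered on the origin with semi-axes $\sqrt{\lambda_1}\,\mathbf{q}_1$ and $\sqrt{\lambda_2}\,\mathbf{q}_2$.]

Figure 7.2: Contours of equal probability density. Points $\mathbf{z}$ on the $\mathbf{q}_1$ axis are points
for which $\langle\mathbf{z},\mathbf{q}_2\rangle = 0$ and points on the $\mathbf{q}_2$ axis satisfy $\langle\mathbf{z},\mathbf{q}_1\rangle = 0$. Points on the
illustrated ellipse satisfy $\mathbf{z}^TK_{\mathbf{Z}}^{-1}\mathbf{z} = 1$.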


The probability density formulas in (7.19) and (7.23) suggest that for every covariance matrix
$K$, there is a jointly Gaussian rv that has that covariance, and thus has that probability density.
This is in fact true, but to verify it, we must demonstrate that for every covariance matrix
$K$, there is a matrix $A$ (and thus a rv $\mathbf{Z} = A\mathbf{W}$) such that $K = AA^T$. There are many such
matrices for any given $K$, but a particularly convenient one is given in (7.88). As a function
of the eigenvectors and eigenvalues of $K$, it is $A = \sum_k \sqrt{\lambda_k}\,\mathbf{q}_k\mathbf{q}_k^T$. Thus, for every nonsingular
covariance matrix, $K$, there is a jointly Gaussian rv whose density satisfies (7.19) and (7.23).
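The claim that $A = \sum_k \sqrt{\lambda_k}\,\mathbf{q}_k\mathbf{q}_k^T$ satisfies $K = AA^T$ can be verified in one line. The step below is a sketch that assumes the eigenvector expansion $K = \sum_k \lambda_k\mathbf{q}_k\mathbf{q}_k^T$ (the kind of result the appendix develops) and uses the orthonormality $\mathbf{q}_k^T\mathbf{q}_j = \delta_{kj}$ together with the symmetry $A^T = A$:

```latex
\begin{aligned}
AA^T
 &= \Bigl(\sum_k \sqrt{\lambda_k}\,\mathbf{q}_k\mathbf{q}_k^T\Bigr)
    \Bigl(\sum_j \sqrt{\lambda_j}\,\mathbf{q}_j\mathbf{q}_j^T\Bigr) \\
 &= \sum_{k,j} \sqrt{\lambda_k\lambda_j}\;\mathbf{q}_k
    \bigl(\mathbf{q}_k^T\mathbf{q}_j\bigr)\mathbf{q}_j^T \\
 &= \sum_k \lambda_k\,\mathbf{q}_k\mathbf{q}_k^T \;=\; K .
\end{aligned}
```

The square roots are real because the eigenvalues of a covariance matrix are nonnegative.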


7.3.6 Fourier transforms for joint densities 


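As suggested in Exercise 7.2, Fourier transforms of probability densities are useful for finding
the probability density of sums of independent random variables. More generally, for an
n-dimensional rv, $\mathbf{Z}$, we can define the n-dimensional Fourier transform of $f_{\mathbf{Z}}(\mathbf{z})$ as

$$\hat{f}_{\mathbf{Z}}(\mathbf{s}) = \int \cdots \int f_{\mathbf{Z}}(\mathbf{z}) \exp(-2\pi i\,\mathbf{s}^T\mathbf{z})\, dz_1 \cdots dz_n = E[\exp(-2\pi i\,\mathbf{s}^T\mathbf{Z})]. \tag{7.24}$$

If $\mathbf{Z}$ is jointly Gaussian, this is easy to calculate. For any given $\mathbf{s} \ne 0$, let $X = \mathbf{s}^T\mathbf{Z} = \sum_k s_kZ_k$.
Thus $X$ is Gaussian with variance $E[\mathbf{s}^T\mathbf{Z}\mathbf{Z}^T\mathbf{s}] = \mathbf{s}^TK_{\mathbf{Z}}\mathbf{s}$. From Exercise 7.2,

$$\hat{f}_X(\theta) = E[\exp(-2\pi i\theta\,\mathbf{s}^T\mathbf{Z})] = \exp\left[-\frac{(2\pi\theta)^2\,\mathbf{s}^TK_{\mathbf{Z}}\mathbf{s}}{2}\right]. \tag{7.25}$$

Comparing (7.25) for $\theta = 1$ with (7.24), we see that

$$\hat{f}_{\mathbf{Z}}(\mathbf{s}) = \exp\left[-\frac{(2\pi)^2\,\mathbf{s}^TK_{\mathbf{Z}}\mathbf{s}}{2}\right]. \tag{7.26}$$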
The above derivation also demonstrates that $\hat{f}_{\mathbf{Z}}(\mathbf{s})$ is determined by the Fourier transform
of each linear combination of the elements of $\mathbf{Z}$. In other words, if an arbitrary rv $\mathbf{Z}$ has
covariance $K_{\mathbf{Z}}$ and has the property that all linear combinations of $\mathbf{Z}$ are Gaussian, then the
Fourier transform of its density is given by (7.26). Thus, assuming that the Fourier transform of
the density uniquely specifies the density, $\mathbf{Z}$ must be jointly Gaussian if all linear combinations
of $\mathbf{Z}$ are Gaussian.
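The one-dimensional transform used in (7.25) can be checked by direct numerical integration. The sketch below forms $X = \mathbf{s}^T\mathbf{Z}$ for a hypothetical covariance matrix and vector $\mathbf{s}$, integrates the density of $X$ against the cosine term with a midpoint rule, and compares the result with the closed form. The integration range and step count are arbitrary choices.

```python
import math


def quad_form(s, k):
    """Return s^T K s, the variance of X = s^T Z."""
    n = len(s)
    return sum(s[i] * k[i][j] * s[j] for i in range(n) for j in range(n))


def gaussian_density(x, var):
    """Density of a N(0, var) rv."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fourier_numeric(theta, var, half_width=12.0, steps=20000):
    """Midpoint-rule approximation of E[exp(-2 pi i theta X)], X ~ N(0, var).

    The imaginary part vanishes by symmetry, so only the cosine term is summed.
    """
    limit = half_width * math.sqrt(var)
    dx = 2.0 * limit / steps
    total = 0.0
    for i in range(steps):
        x = -limit + (i + 0.5) * dx
        total += gaussian_density(x, var) * math.cos(2.0 * math.pi * theta * x) * dx
    return total


def fourier_closed_form(theta, var):
    """Right side of (7.25) with var = s^T K_Z s."""
    return math.exp(-((2.0 * math.pi * theta) ** 2) * var / 2.0)


K = [[5.0, 1.0], [1.0, 2.0]]   # hypothetical covariance matrix
s = [0.1, 0.2]                 # arbitrary frequency vector
var = quad_form(s, K)
print(fourier_numeric(1.0, var), fourier_closed_form(1.0, var))
```

Setting $\theta = 1$ in this comparison is exactly the step that turns (7.25) into (7.26).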


A number of equivalent conditions have now been derived under which a (zero-mean) random
vector $\mathbf{Z}$ is jointly Gaussian. In summary, each of the following is a necessary and sufficient
condition for a rv $\mathbf{Z}$ with a nonsingular covariance $K_{\mathbf{Z}}$ to be jointly Gaussian.

• $\mathbf{Z} = A\mathbf{W}$ where the components of $\mathbf{W}$ are iid normal and $K_{\mathbf{Z}} = AA^T$;
• $\mathbf{Z}$ has the joint probability density given in (7.19);
• $\mathbf{Z}$ has the joint probability density given in (7.23);
• All linear combinations of $\mathbf{Z}$ are Gaussian random variables.

For the case where $K_{\mathbf{Z}}$ is singular, the above conditions are necessary and sufficient for any
linearly independent subset of the components of $\mathbf{Z}$.

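This section has considered only zero-mean random variables, vectors, and processes. The results
here apply directly to the fluctuation of arbitrary random variables, vectors, and processes. In
particular the probability density for a jointly Gaussian rv $\mathbf{Z}$ with a nonsingular covariance
matrix $K_{\mathbf{Z}}$ and mean vector $\bar{\mathbf{Z}}$ is

$$f_{\mathbf{Z}}(\mathbf{z}) = \frac{1}{(2\pi)^{n/2}\sqrt{\det(K_{\mathbf{Z}})}} \exp\left[-\frac{1}{2}(\mathbf{z}-\bar{\mathbf{Z}})^TK_{\mathbf{Z}}^{-1}(\mathbf{z}-\bar{\mathbf{Z}})\right]. \tag{7.27}$$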


7.4 Linear functionals and filters for random processes 


This section defines the important concept of linear functionals on arbitrary random processes
$\{Z(t);\ t \in \mathbb{R}\}$ and then specializes to Gaussian random processes, where the results of the
previous section can be used. Assume that the sample functions $Z(t,\omega)$ of $Z(t)$ are real $\mathcal{L}_2$
waveforms. These sample functions can be viewed as vectors over $\mathbb{R}$ in the $\mathcal{L}_2$ space of real
waveforms. For any given real $\mathcal{L}_2$ waveform $g(t)$, there is an inner product,

$$\langle Z(t,\omega), g(t)\rangle = \int_{-\infty}^{\infty} Z(t,\omega)\,g(t)\,dt.$$


By the Schwarz inequality, the magnitude of this inner product in the space of real $\mathcal{L}_2$ functions
is upper bounded by $\|Z(t,\omega)\|\,\|g(t)\|$ and is thus a finite real value for each $\omega$. This then maps
sample points $\omega$ into real numbers and is thus a random variable,¹⁰ denoted $V = \int_{-\infty}^{\infty} Z(t)g(t)\,dt$.
This random variable $V$ is called a linear functional of the process $\{Z(t);\ t \in \mathbb{R}\}$.


As an example of the importance of linear functionals, recall that the demodulator for both PAM
and QAM contains a filter $q(t)$ followed by a sampler. The output of the filter at a sampling
time $kT$ for an input $u(t)$ is $\int u(t)q(kT - t)\,dt$. If the filter input also contains additive noise
$Z(t)$, then the output at time $kT$ also contains the linear functional $\int Z(t)q(kT - t)\,dt$.


Similarly, for any random process $\{Z(t);\ t \in \mathbb{R}\}$ (again assuming $\mathcal{L}_2$ sample functions) and
any real $\mathcal{L}_2$ function $h(t)$, consider the result of passing $Z(t)$ through the filter with impulse
response $h(t)$. For any $\mathcal{L}_2$ sample function $Z(t,\omega)$, the filter output at any given time $\tau$ is the
inner product

$$\langle Z(t,\omega), h(\tau - t)\rangle = \int_{-\infty}^{\infty} Z(t,\omega)\,h(\tau - t)\,dt.$$

For each real $\tau$, this maps sample points $\omega$ into real numbers and thus (aside from measure
theoretic issues),

$$V(\tau) = \int_{-\infty}^{\infty} Z(t)\,h(\tau - t)\,dt \tag{7.28}$$

is a rv for each $\tau$. This means that $\{V(\tau);\ \tau \in \mathbb{R}\}$ is a random process. This is called the filtered
process resulting from passing $Z(t)$ through the filter $h(t)$. Not much can be said about this
general problem without developing a great deal of mathematics, so instead we restrict ourselves
to Gaussian processes and other relatively simple examples.


For a Gaussian process, we would hope that a linear functional is a Gaussian random variable. 
The following examples show that some restrictions are needed even on the class of Gaussian 
processes. 


Example 7.4.1. Let $Z(t) = tX$ for all $t \in \mathbb{R}$ where $X \sim \mathcal{N}(0,1)$. The sample functions of
this Gaussian process have infinite energy with probability 1. The output of the filter also has
infinite energy except for very special choices of $h(t)$.


Example 7.4.2. For each $t \in [0, 1]$, let $W(t)$ be a Gaussian rv, $W(t) \sim \mathcal{N}(0,1)$. Assume
also that $E[W(t)W(\tau)] = 0$ for each $t \ne \tau \in [0,1]$. The sample functions of this process
are discontinuous everywhere¹¹. We do not have the machinery to decide whether the sample
functions are integrable, let alone whether the linear functionals above exist; we come back later
to further discuss this example.


In order to avoid the mathematical issues in Example 7.4.2 above, along with a host of other
mathematical issues, we start with Gaussian processes defined in terms of orthonormal
expansions.


10 One should use measure theory over the sample space $\Omega$ to interpret these mappings carefully, but this is
unnecessary for the simple types of situations here and would take us too far afield.

11 Even worse, the sample functions are not measurable. This process would not even be called a random process
in a measure theoretic formulation, but it provides an interesting example of the occasional need for a measure
theoretic formulation.

7.4.1 Gaussian processes defined over orthonormal expansions


Let $\{\phi_k(t);\ k \ge 1\}$ be a countable set of real orthonormal functions and let $\{Z_k;\ k \ge 1\}$ be a
sequence of independent Gaussian random variables, $\{Z_k \sim \mathcal{N}(0,\sigma_k^2)\}$. Consider the Gaussian process
defined by

$$Z(t) = \sum_{k=1}^{\infty} Z_k\phi_k(t). \tag{7.29}$$


Essentially all zero-mean Gaussian processes of interest can be defined this way, although we will
not prove this. Clearly a mean can be added if desired, but zero-mean processes are assumed in
what follows. First consider the simple case in which $\sigma_k^2$ is nonzero for only finitely many values
of k, say $1 \le k \le n$. In this case, $Z(t)$, for each $t \in \mathbb{R}$, is a finite sum,

$$Z(t) = \sum_{k=1}^{n} Z_k\phi_k(t), \tag{7.30}$$


of independent Gaussian rv's and thus is Gaussian. It is also clear that $Z(t_1), Z(t_2),\ldots,Z(t_\ell)$ are
jointly Gaussian for all $\ell, t_1,\ldots,t_\ell$, so $\{Z(t);\ t \in \mathbb{R}\}$ is in fact a Gaussian random process. The
energy in any sample function, $z(t) = \sum_k z_k\phi_k(t)$, is $\sum_{k=1}^{n} z_k^2$. This is finite (since the sample
values are real and thus finite), so every sample function is $\mathcal{L}_2$. The covariance function is then
easily calculated to be

$$K_Z(t,\tau) = \sum_{k,m} E[Z_kZ_m]\,\phi_k(t)\phi_m(\tau) = \sum_{k=1}^{n} \sigma_k^2\,\phi_k(t)\phi_k(\tau). \tag{7.31}$$


Next consider the linear functional $\int Z(t)g(t)\,dt$ where $g(t)$ is a real $\mathcal{L}_2$ function,

$$V = \int_{-\infty}^{\infty} Z(t)g(t)\,dt = \sum_{k=1}^{n} Z_k \int_{-\infty}^{\infty} \phi_k(t)g(t)\,dt. \tag{7.32}$$


Since $V$ is a weighted sum of the zero-mean independent Gaussian rv's $Z_1,\ldots,Z_n$, $V$ is also
Gaussian with variance

$$\sigma_V^2 = E[V^2] = \sum_{k=1}^{n} \sigma_k^2\,|\langle\phi_k, g\rangle|^2. \tag{7.33}$$


Next consider the case where n is infinite but $\sum_k \sigma_k^2 < \infty$. The sample functions are still $\mathcal{L}_2$ (at
least with probability 1). Equations (7.29), (7.30), (7.31), (7.32) and (7.33) are still valid, and
$V$ is still a Gaussian rv. We do not have the machinery to easily prove this, although Exercise
7.7 provides quite a bit of insight into why these results are true.

Finally, consider a finite set of $\mathcal{L}_2$ waveforms $\{g_m(t);\ 1 \le m \le \ell\}$. Let $V_m = \int_{-\infty}^{\infty} Z(t)g_m(t)\,dt$.
By the same argument as above, $V_m$ is a Gaussian rv for each m. Furthermore, since each linear
combination of these variables is also a linear functional, it is also Gaussian, so $\{V_1,\ldots,V_\ell\}$ is
jointly Gaussian.

7.4.2 Linear filtering of Gaussian processes


We can use the same argument as in the previous subsection to look at the output of a linear
filter for which the input is a Gaussian process $\{Z(t);\ t \in \mathbb{R}\}$. In particular, assume that
$Z(t) = \sum_k Z_k\phi_k(t)$ where $Z_1, Z_2,\ldots$ is an independent sequence $\{Z_k \sim \mathcal{N}(0,\sigma_k^2)\}$ satisfying
$\sum_k \sigma_k^2 < \infty$ and where $\phi_1(t), \phi_2(t),\ldots$ is a sequence of orthonormal functions.


$\{Z(t);\ t \in \mathbb{R}\}$ → [ $h(t)$ ] → $\{V(\tau);\ \tau \in \mathbb{R}\}$

Figure 7.3: Filtered random process.

Assume that the impulse response $h(t)$ of the filter is a real $\mathcal{L}_2$ waveform. Then for any given
sample function $Z(t,\omega) = \sum_k Z_k(\omega)\phi_k(t)$ of the input, the filter output at any epoch $\tau$ is given
by

$$V(\tau,\omega) = \int_{-\infty}^{\infty} Z(t,\omega)\,h(\tau - t)\,dt = \sum_k Z_k(\omega) \int_{-\infty}^{\infty} \phi_k(t)\,h(\tau - t)\,dt. \tag{7.34}$$


Each integral on the right side of (7.34) is an $\mathcal{L}_2$ function of $\tau$ whose energy is upper bounded
by $\|h\|^2$ (see Exercise 7.5). It follows from this (see Exercise 7.7) that $\int_{-\infty}^{\infty} Z(t,\omega)h(\tau - t)\,dt$ is
an $\mathcal{L}_2$ waveform with probability 1. For any given epoch $\tau$, (7.34) maps sample points $\omega$ to real
values and thus $V(\tau,\omega)$ is a sample value of a random variable $V(\tau)$ defined as

$$V(\tau) = \int_{-\infty}^{\infty} Z(t)\,h(\tau - t)\,dt = \sum_k Z_k \int_{-\infty}^{\infty} \phi_k(t)\,h(\tau - t)\,dt. \tag{7.35}$$

This is a Gaussian rv for each epoch $\tau$. For any set of epochs, $\tau_1,\ldots,\tau_\ell$, we see that
$V(\tau_1),\ldots,V(\tau_\ell)$ are jointly Gaussian. Thus $\{V(\tau);\ \tau \in \mathbb{R}\}$ is a Gaussian random process.


We summarize the last two subsections in the following theorem. 


Theorem 7.4.1. Let $\{Z(t);\ t \in \mathbb{R}\}$ be a Gaussian process, $Z(t) = \sum_k Z_k\phi_k(t)$, where $\{Z_k;\ k \ge 1\}$
is a sequence of independent Gaussian rv's $\mathcal{N}(0,\sigma_k^2)$ where $\sum_k \sigma_k^2 < \infty$ and $\{\phi_k(t);\ k \ge 1\}$ is
an orthonormal set. Then

• For any set of $\mathcal{L}_2$ waveforms $g_1(t),\ldots,g_\ell(t)$, the linear functionals $\{V_m;\ 1 \le m \le \ell\}$ given
by $V_m = \int_{-\infty}^{\infty} Z(t)g_m(t)\,dt$ are zero-mean jointly Gaussian.

• For any filter with real $\mathcal{L}_2$ impulse response $h(t)$, the filter output $\{V(\tau);\ \tau \in \mathbb{R}\}$ given by
(7.35) is a zero-mean Gaussian process.


These are important results. The first, concerning sets of linear functionals, is important when 
we represent the input to the channel in terms of an orthonormal expansion; the noise can then 
often be expanded in the same orthonormal expansion. The second, concerning linear filtering, 
shows that when the received signal and noise are passed through a linear filter, the noise at the 
filter output is simply another zero-mean Gaussian process. This theorem is often summarized 
by saying that linear operations preserve Gaussianity. 


7.4.3 Covariance for linear functionals and filters 


Assume that $\{Z(t);\ t \in \mathbb{R}\}$ is a random process and that $g_1(t),\ldots,g_\ell(t)$ are real $\mathcal{L}_2$ waveforms.
We have seen that if $\{Z(t);\ t \in \mathbb{R}\}$ is Gaussian, then the linear functionals $V_1,\ldots,V_\ell$ given by
$V_m = \int_{-\infty}^{\infty} Z(t)g_m(t)\,dt$ are jointly Gaussian for $1 \le m \le \ell$. We now want to find the covariance
for each pair $V_j, V_m$ of these random variables. The result does not depend on the process
$Z(t)$ being Gaussian. The computation is quite simple, although we omit questions of limits,
interchanges of order of expectation and integration, etc. A more careful derivation could be
made by returning to the sampling theorem arguments before, but this would somewhat obscure
the ideas. Assuming that the process $Z(t)$ is zero mean,

$$E[V_jV_m] = E\left[\int_{-\infty}^{\infty} Z(t)g_j(t)\,dt \int_{-\infty}^{\infty} Z(\tau)g_m(\tau)\,d\tau\right] \tag{7.36}$$

$$\phantom{E[V_jV_m]} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g_j(t)\,E[Z(t)Z(\tau)]\,g_m(\tau)\,dt\,d\tau \tag{7.37}$$

$$\phantom{E[V_jV_m]} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g_j(t)\,K_Z(t,\tau)\,g_m(\tau)\,dt\,d\tau. \tag{7.38}$$


Each covariance term (including $E[V_m^2]$ for each m) then depends only on the covariance function
of the process and the set of waveforms $\{g_m;\ 1 \le m \le \ell\}$.
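For a process built from a finite orthonormal expansion, the variance formula (7.33) and the double integral (7.38) with $g_j = g_m = g$ must give the same number. The sketch below checks this with a hypothetical set of four orthonormal rectangular pulses on $[0,1)$, assumed variances, and the waveform $g(t) = t$ on $[0,1)$; integrals are approximated by midpoint sums.

```python
import math

N_FUNCS = 4
VARIANCES = [1.0, 0.5, 0.25, 0.125]   # assumed values of sigma_k^2


def phi(k, t):
    """k-th orthonormal pulse (k = 0..N_FUNCS-1): height sqrt(N) on [k/N, (k+1)/N)."""
    return math.sqrt(N_FUNCS) if k / N_FUNCS <= t < (k + 1) / N_FUNCS else 0.0


def g(t):
    """Hypothetical L2 waveform: a ramp on [0, 1), zero elsewhere."""
    return t if 0.0 <= t < 1.0 else 0.0


def cov(t, tau):
    """K_Z(t, tau) from (7.31)."""
    return sum(v * phi(k, t) * phi(k, tau) for k, v in enumerate(VARIANCES))


def midpoints(m):
    """Midpoints of m equal cells covering [0, 1)."""
    return [(i + 0.5) / m for i in range(m)]


def variance_via_expansion(m=200):
    """(7.33): sum over k of sigma_k^2 <phi_k, g>^2."""
    dt = 1.0 / m
    total = 0.0
    for k, v in enumerate(VARIANCES):
        inner = sum(phi(k, t) * g(t) for t in midpoints(m)) * dt
        total += v * inner * inner
    return total


def variance_via_covariance(m=200):
    """(7.38) with g_j = g_m = g: double integral of g(t) K_Z(t, tau) g(tau)."""
    dt = 1.0 / m
    pts = midpoints(m)
    return sum(g(t) * cov(t, tau) * g(tau) for t in pts for tau in pts) * dt * dt


print(variance_via_expansion(), variance_via_covariance())
```

The agreement does not rely on the process being Gaussian, which matches the remark that (7.38) depends only on the covariance function.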


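The convolution $V(r) = \int Z(t)h(r - t)\,dt$ is a linear functional at each time $r$, so the covariance
for the filtered output of $\{Z(t);\ t \in \mathbb{R}\}$ follows in the same way as the results above. The output
$\{V(r)\}$ for a filter with a real $\mathcal{L}_2$ impulse response $h$ is given by (7.35), so the covariance of the
output can be found as

$$\begin{aligned} K_V(r,s) &= E[V(r)V(s)] \\ &= E\left[\int_{-\infty}^{\infty} Z(t)h(r-t)\,dt \int_{-\infty}^{\infty} Z(\tau)h(s-\tau)\,d\tau\right] \\ &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(r-t)\,K_Z(t,\tau)\,h(s-\tau)\,dt\,d\tau. \end{aligned} \tag{7.39}$$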


7.5 Stationarity and related concepts 


Many of the most useful random processes have the property that the location of the time origin 
is irrelevant, i.e., they “behave” the same way at one time as at any other time. This property 
is called stationarity and such a process is called a stationary process. 


Since the location of the time origin must be irrelevant for stationarity, random processes that
are defined over any interval other than $(-\infty, \infty)$ cannot be stationary. Thus assume a process
that is defined over $(-\infty, \infty)$.


The next requirement for a random process $\{Z(t);\ t \in \mathbb{R}\}$ to be stationary is that $Z(t)$ must
be identically distributed for all epochs $t \in \mathbb{R}$. This means that, for any epochs $t$ and $t + \tau$,
and for any real number $x$, $\Pr\{Z(t) \le x\} = \Pr\{Z(t+\tau) \le x\}$. This does not mean that $Z(t)$
and $Z(t+\tau)$ are the same random variables; for a given sample outcome $\omega$ of the experiment,
$Z(t,\omega)$ is typically unequal to $Z(t+\tau,\omega)$. It simply means that $Z(t)$ and $Z(t+\tau)$ have the same
distribution function, i.e.,

$$F_{Z(t)}(x) = F_{Z(t+\tau)}(x) \quad \text{for all } x. \tag{7.40}$$


This is still not enough for stationarity, however. The joint distributions over any set of epochs
must remain the same if all those epochs are shifted to new epochs by an arbitrary shift $\tau$. This
includes the previous requirement as a special case, so we have the definition:

Definition 7.5.1. A random process $\{Z(t);\ t \in \mathbb{R}\}$ is stationary if, for all positive integers $\ell$, for
all sets of epochs $t_1,\ldots,t_\ell \in \mathbb{R}$, for all amplitudes $z_1,\ldots,z_\ell$, and for all shifts $\tau \in \mathbb{R}$,

$$F_{Z(t_1),\ldots,Z(t_\ell)}(z_1,\ldots,z_\ell) = F_{Z(t_1+\tau),\ldots,Z(t_\ell+\tau)}(z_1,\ldots,z_\ell). \tag{7.41}$$

For the typical case where densities exist, this can be rewritten as

$$f_{Z(t_1),\ldots,Z(t_\ell)}(z_1,\ldots,z_\ell) = f_{Z(t_1+\tau),\ldots,Z(t_\ell+\tau)}(z_1,\ldots,z_\ell) \tag{7.42}$$

for all $z_1,\ldots,z_\ell \in \mathbb{R}$.


For a (zero-mean) Gaussian process, the joint distribution of $Z(t_1),\ldots,Z(t_\ell)$ depends only on
the covariance of those variables. Thus, this distribution will be the same as that of $Z(t_1+\tau)$,
$\ldots, Z(t_\ell+\tau)$ if $K_Z(t_m, t_j) = K_Z(t_m+\tau, t_j+\tau)$ for $1 \le m, j \le \ell$. This condition will be satisfied for
all $\tau$, all $\ell$, and all $t_1,\ldots,t_\ell$ (verifying that $\{Z(t)\}$ is stationary) if $K_Z(t_1, t_2) = K_Z(t_1+\tau, t_2+\tau)$
for all $\tau$ and all $t_1, t_2$. This latter condition will be satisfied if $K_Z(t_1, t_2) = K_Z(t_1-t_2, 0)$ for all
$t_1, t_2$. We have thus shown that a zero-mean Gaussian process is stationary if

$$K_Z(t_1, t_2) = K_Z(t_1 - t_2, 0) \quad \text{for all } t_1, t_2 \in \mathbb{R}. \tag{7.43}$$


Conversely, if (7.43) is not satisfied for some choice of $t_1, t_2$, then the joint distribution of
$Z(t_1), Z(t_2)$ must be different from that of $Z(t_1-t_2), Z(0)$, and the process is not stationary.
The following theorem summarizes this.


Theorem 7.5.1. A zero-mean Gaussian process $\{Z(t);\ t \in \mathbb{R}\}$ is stationary if and only if (7.43)
is satisfied.


An obvious consequence of this is that a Gaussian process with a nonzero mean is stationary if
and only if its mean is constant and its fluctuation satisfies (7.43).
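Condition (7.43) is equivalent to shift invariance of the covariance function, which can be spot-checked on a finite grid. The sketch below does this for two covariance functions: an exponential covariance, used here as a hypothetical example of a function of $t - \tau$ only, and $K_Z(t,\tau) = t\tau$, the covariance of the process $Z(t) = tX$ of Example 7.4.1. A grid check of this kind can only refute shift invariance, never prove it.

```python
import math


def is_shift_invariant(cov, times, shifts, tol=1e-12):
    """Check K(t1, t2) == K(t1 + s, t2 + s) for all grid points and shifts."""
    return all(abs(cov(t1, t2) - cov(t1 + s, t2 + s)) <= tol
               for t1 in times for t2 in times for s in shifts)


def exp_cov(t, tau):
    """Hypothetical covariance depending only on the difference t - tau."""
    return math.exp(-abs(t - tau))


def ramp_cov(t, tau):
    """Covariance of Z(t) = t X with X ~ N(0, 1): E[tX * tau X] = t * tau."""
    return t * tau


times = [-1.0, 0.0, 0.5, 2.0]    # arbitrary epochs
shifts = [-3.0, 0.25, 1.0]       # arbitrary shifts

print(is_shift_invariant(exp_cov, times, shifts))
print(is_shift_invariant(ramp_cov, times, shifts))
```

By Theorem 7.5.1, a zero-mean Gaussian process with the second covariance cannot be stationary, since its variance $t^2$ changes with the epoch.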


7.5.1 Wide-sense stationary (WSS) random processes 


There are many results in probability theory that depend only on the covariances of the random 
variables of interest (and also the mean if nonzero). For random processes, a number of these 
classical results are simplified for stationary processes, and these simplifications depend only on 
the mean and covariance of the process rather than full stationarity. This leads to the following 
definition: 


Definition 7.5.2. A random process $\{Z(t);\ t \in \mathbb{R}\}$ is wide-sense stationary (WSS) if $E[Z(t_1)] =
E[Z(0)]$ and $K_Z(t_1, t_2) = K_Z(t_1 - t_2, 0)$ for all $t_1, t_2 \in \mathbb{R}$.


Since the covariance function $K_Z(t+\tau, t)$ of a WSS process is a function of only one variable
$\tau$, we will often write the covariance function as a function of one variable, namely $K_Z(\tau)$ in
place of $K_Z(t+\tau, t)$. In other words, the single variable in the single argument form represents
the difference between the two arguments in two argument form. Thus for a WSS process,
$K_Z(t,\tau) = K_Z(t-\tau, 0) = K_Z(t-\tau)$.

The random processes defined as expansions of T-spaced sinc functions have been discussed
several times. In particular let

$$V(t) = \sum_k V_k\,\mathrm{sinc}\left(\frac{t - kT}{T}\right) \tag{7.44}$$

where $\{\ldots, V_{-1}, V_0, V_1, \ldots\}$ is a sequence of (zero-mean) iid rv's. As shown in 7.8, the covariance
function for this random process is

$$K_V(t,\tau) = \sigma_V^2 \sum_k \mathrm{sinc}\left(\frac{t - kT}{T}\right) \mathrm{sinc}\left(\frac{\tau - kT}{T}\right), \tag{7.45}$$

where $\sigma_V^2$ is the variance of each $V_k$. The sum in (7.45), as shown below, is a function only of
$t - \tau$, leading to the theorem:


Theorem 7.5.2 (Sinc expansion). The random process in (7.44) is WSS. In addition, if the
rv's $\{V_k;\ k \in \mathbb{Z}\}$ are iid Gaussian, the process is stationary. The covariance function is given by

$$K_V(t - \tau) = \sigma_V^2\,\mathrm{sinc}\left(\frac{t - \tau}{T}\right). \tag{7.46}$$


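Proof: From the sampling theorem, any $\mathcal{L}_2$ function $u(t)$, baseband limited to $1/(2T)$, can be
expanded as

$$u(t) = \sum_k u(kT)\,\mathrm{sinc}\left(\frac{t - kT}{T}\right). \tag{7.47}$$

For any given $\tau$, take $u(t)$ to be $\mathrm{sinc}\left(\frac{t-\tau}{T}\right)$. Substituting this in (7.47),

$$\mathrm{sinc}\left(\frac{t - \tau}{T}\right) = \sum_k \mathrm{sinc}\left(\frac{kT - \tau}{T}\right) \mathrm{sinc}\left(\frac{t - kT}{T}\right) = \sum_k \mathrm{sinc}\left(\frac{\tau - kT}{T}\right) \mathrm{sinc}\left(\frac{t - kT}{T}\right). \tag{7.48}$$

Substituting this in (7.45) shows that the process is WSS with the stated covariance. As shown
in subsection 7.4.1, $\{V(t);\ t \in \mathbb{R}\}$ is Gaussian if the rv's $\{V_k\}$ are Gaussian. From Theorem
7.5.1, this Gaussian process is stationary since it is WSS.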
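The identity (7.48) can be checked numerically by truncating the sum. The sketch below compares a truncated version of the sum in (7.45) (with $\sigma_V^2 = 1$) against $\mathrm{sinc}((t-\tau)/T)$. The truncation limit `k_max`, the sample epochs, and the tolerance implied by the truncation are assumptions of this illustration; the terms decay like $1/k^2$, so the truncated sum converges slowly.

```python
import math


def sinc(x):
    """sinc(x) = sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def sinc_sum(t, tau, T=1.0, k_max=2000):
    """Truncated version of the sum in (7.45) with unit variance."""
    return sum(sinc((t - k * T) / T) * sinc((tau - k * T) / T)
               for k in range(-k_max, k_max + 1))


# Arbitrary epochs; the sum should depend only on t - tau.
lhs_a = sinc_sum(0.3, 1.7)
rhs_a = sinc((0.3 - 1.7) / 1.0)
lhs_b = sinc_sum(0.5, -1.2, T=2.0)
rhs_b = sinc((0.5 + 1.2) / 2.0)
# Same difference t - tau as the first pair, shifted by 0.45.
lhs_c = sinc_sum(0.75, 2.15)
print(lhs_a, rhs_a)
print(lhs_b, rhs_b)
print(lhs_c)
```

The third value matches the first to within the truncation error, which is the shift invariance that makes the process WSS.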


Next consider another special case of the sinc expansion in which each $V_k$ is binary, taking values
$\pm 1$ with equal probability. This corresponds to a simple form of a PAM transmitted waveform.
In this case, $V(kT)$ must be $\pm 1$, whereas for values of $t$ between the sample points, $V(t)$ can
take on a wide range of values. Thus this process is WSS but cannot be stationary. Similarly,
any discrete distribution for each $V_k$ creates a process that is WSS but not stationary.


There are not many important models of noise processes that are WSS but not stationary¹²,
despite the above example and the widespread usage of the term WSS. Rather, the notion of
wide-sense stationarity is used to make clear, for some results, that they depend only on the
mean and covariance, thus perhaps making it easier to understand them.


The Gaussian sinc expansion brings out an interesting theoretical non sequitur. Assuming that
$\sigma_V^2 > 0$, i.e., that the process is not the trivial process for which $V(t) = 0$ with probability 1
for all $t$, the expected energy in the process (taken over all time) is infinite. It is not difficult to
convince oneself that the sample functions of this process have infinite energy with probability 1.
Thus stationary noise models are simple to work with, but the sample functions of these processes
don't fit into the $\mathcal{L}_2$ theory of waveforms that has been developed. Even more important than
the issue of infinite energy, stationary noise models make unwarranted assumptions about the
very distant past and future. The extent to which these assumptions affect the results about
the present is an important question that must be asked.

12 An important exception is interference from other users, which as the above sinc expansion with binary
samples shows, can be WSS but not stationary. Even in this case, if the interference is modeled as just part of
the noise (rather than specifically as interference), the nonstationarity is usually ignored.


The problem here is not with the peculiarities of the Gaussian sinc expansion. Rather it is
that stationary processes have constant power over all time, and thus have infinite energy. One
practical solution$^{13}$ to this is simple and familiar. The random process is simply truncated in
any convenient way. Thus, when we say that noise is stationary, we mean that it is stationary
within a much longer time interval than the interval of interest for communication. This is not
very precise, and the notion of effective stationarity is now developed to formalize this notion
of a truncated stationary process.


7.5.2 Effectively stationary and effectively WSS random processes 


Definition 7.5.3. A (zero-mean) random process is effectively stationary within $[-T_0/2,\, T_0/2]$ if the
joint probability assignment for $t_1, \ldots, t_n$ is the same as that for $t_1+\tau, t_2+\tau, \ldots, t_n+\tau$ whenever
$t_1, \ldots, t_n$ and $t_1+\tau, t_2+\tau, \ldots, t_n+\tau$ are all contained in the interval $[-T_0/2,\, T_0/2]$. It is effectively
WSS within $[-T_0/2,\, T_0/2]$ if $K_Z(t, \tau)$ is a function only of $t - \tau$ for $t, \tau \in [-T_0/2,\, T_0/2]$. A random
process with nonzero mean is effectively stationary (effectively WSS) if its mean is constant
within $[-T_0/2,\, T_0/2]$ and its fluctuation is effectively stationary (WSS) within $[-T_0/2,\, T_0/2]$.

One way to view a stationary (WSS) random process is in the limiting sense of a process that is
effectively stationary (WSS) for all intervals $[-T_0/2,\, T_0/2]$. For operations such as linear functionals
and filtering, the nature of this limit as $T_0$ becomes large is quite simple and natural, whereas
for frequency domain results, the effect of finite $T_0$ is quite subtle.


For an effectively WSS process within $[-T_0/2,\, T_0/2]$, the covariance within $[-T_0/2,\, T_0/2]$ is a function
of a single parameter, $K_Z(t, \tau) = K_Z(t - \tau)$ for $t, \tau \in [-T_0/2,\, T_0/2]$. Note however that $t - \tau$ can
range from $-T_0$ (for $t = -T_0/2$, $\tau = T_0/2$) to $T_0$ (for $t = T_0/2$, $\tau = -T_0/2$).

[Figure 7.4 (diagram): the square of points $(t, \tau)$ with $-T_0/2 \le t, \tau \le T_0/2$, crossed by dashed
diagonal lines of constant $t - \tau$: the corner point where $t - \tau = -T_0$, and the lines where
$t - \tau = -T_0/2$, $t - \tau = 0$, $t - \tau = T_0/2$, and $t - \tau = \frac{3}{4}T_0$.]

Figure 7.4: The relationship of the two argument covariance function $K_Z(t, \tau)$ and the
one argument function $K_Z(t - \tau)$ for an effectively WSS process. $K_Z(t, \tau)$ is constant on
each dashed line above. Note that, for example, the line for which $t - \tau = \frac{3}{4}T_0$ applies
only for pairs $(t, \tau)$ where $t \ge T_0/4$ and $\tau \le -T_0/4$. Thus the one argument value $K_Z(\frac{3}{4}T_0)$ is not necessarily
equal to the two argument value $K_Z(\frac{3}{4}T_0, 0)$. It can be easily verified, however, that $K_Z(\alpha T_0) = K_Z(\alpha T_0, 0)$
for all $\alpha \le 1/2$.
13 There is another popular solution to this problem. For any $\mathcal{L}_2$ function $g(t)$, the energy in $g(t)$ outside
of $[-T_0/2,\, T_0/2]$ vanishes as $T_0 \to \infty$, so intuitively the effect of these tails on the linear functional $\int g(t) Z(t)\, dt$
vanishes as $T_0 \to \infty$. This provides a nice intuitive basis for ignoring the problem, but it fails, both intuitively and
mathematically, in the frequency domain.




Since a Gaussian process is determined by its covariance function and mean, it is effectively
stationary within $[-T_0/2,\, T_0/2]$ if it is effectively WSS.


Note that the difference between a stationary and effectively stationary random process for large
$T_0$ is primarily a difference in the model and not in the situation being modeled. If two models
have a significantly different behavior over the time intervals of interest, or more concretely, if
noise in the distant past or future has a significant effect, then the entire modeling issue should
be rethought.


7.5.3 Linear functionals for effectively WSS random processes 


The covariance matrix for a set of linear functionals and the covariance function for the output of 
a linear filter take on simpler forms for WSS or effectively WSS processes than the corresponding 
forms for general processes derived in Subsection 7.4.3. 


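Let $Z(t)$ be a zero-mean random process that is effectively WSS, with covariance function $K_Z(t - \tau)$ for $t, \tau \in
[-T_0/2,\, T_0/2]$, and let $g_1(t), g_2(t), \ldots, g_\ell(t)$ be a set of $\mathcal{L}_2$ functions nonzero only within $[-T_0/2,\, T_0/2]$.
For the conventional WSS case, we can take $T_0 = \infty$. Let the linear functional $V_m$ be given by
$\int_{-T_0/2}^{T_0/2} Z(t) g_m(t)\, dt$ for $1 \le m \le \ell$. The covariance $E[V_m V_j]$ is then given by

$$\begin{aligned} E[V_m V_j] &= E\left[\int_{-T_0/2}^{T_0/2} Z(t) g_m(t)\, dt \int_{-T_0/2}^{T_0/2} Z(\tau) g_j(\tau)\, d\tau\right] \\ &= \int_{-T_0/2}^{T_0/2} \int_{-T_0/2}^{T_0/2} g_m(t)\, K_Z(t - \tau)\, g_j(\tau)\, d\tau\, dt. \end{aligned} \tag{7.49}$$

Note that this depends only on the covariance where $t, \tau \in [-T_0/2,\, T_0/2]$, i.e., where $\{Z(t)\}$ is
effectively WSS. This is not surprising, since we would not expect $V_m$ to depend on the behavior
of the process outside of where $g_m(t)$ is nonzero.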


7.5.4 Linear filters for effectively WSS random processes 


Next consider passing a random process $\{Z(t); t \in \mathbb{R}\}$ through a linear time-invariant filter
whose impulse response $h(t)$ is $\mathcal{L}_2$. As pointed out in (7.28), the output of the filter is a random
process $\{V(\tau); \tau \in \mathbb{R}\}$ given by

$$V(\tau) = \int_{-\infty}^{\infty} Z(t_1)\, h(\tau - t_1)\, dt_1.$$

Note that $V(\tau)$ is a linear functional for each choice of $\tau$. The covariance function evaluated
at $t, \tau$ is the covariance of the linear functionals $V(t)$ and $V(\tau)$. Ignoring questions of orders of
integration and convergence,

$$K_V(t, \tau) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(t - t_1)\, K_Z(t_1, t_2)\, h(\tau - t_2)\, dt_1\, dt_2. \tag{7.50}$$


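First assume that $\{Z(t); t \in \mathbb{R}\}$ is WSS in the conventional sense. Then $K_Z(t_1, t_2)$ can be
replaced by $K_Z(t_1 - t_2)$. Replacing $t_1 - t_2$ by $s$ (i.e., $t_1$ by $t_2 + s$),

$$K_V(t, \tau) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} h(t - t_2 - s)\, K_Z(s)\, ds \right] h(\tau - t_2)\, dt_2.$$

Replacing $t_2$ by $\tau + \mu$,

$$K_V(t, \tau) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} h(t - \tau - \mu - s)\, K_Z(s)\, ds \right] h(-\mu)\, d\mu. \tag{7.51}$$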


Thus $K_V(t, \tau)$ is a function only of $t - \tau$. This means that $\{V(t); t \in \mathbb{R}\}$ is WSS. This is not
surprising; passing a WSS random process through a linear time-invariant filter results in another
WSS random process.
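A discrete-time analogue of (7.50) makes the shift invariance easy to check by hand. The sketch below is an illustration under stated assumptions: time is replaced by integers, the filter is a short list of taps, and the input covariance $0.8^{|\text{lag}|}$ is a hypothetical choice; none of these appear in the text.

```python
def output_covariance(t, tau, h, k_z):
    """Discrete-time analogue of (7.50) with K_Z(t1, t2) = k_z(t1 - t2).

    h   : list of filter taps h[0], ..., h[L-1] (zero elsewhere)
    k_z : function mapping an integer lag to the input covariance
    """
    total = 0.0
    for i in range(len(h)):          # i = t - t1
        for j in range(len(h)):      # j = tau - t2
            t1, t2 = t - i, tau - j
            total += h[i] * k_z(t1 - t2) * h[j]
    return total

h = [0.5, 0.3, 0.2]                       # hypothetical filter taps
k_z = lambda lag: 0.8 ** abs(lag)         # hypothetical WSS input covariance

a = output_covariance(5, 3, h, k_z)       # t - tau = 2
b = output_covariance(105, 103, h, k_z)   # same difference, shifted by 100
print(a, b)
```

Both evaluations use the same value of $t - \tau$, so they agree, which is the discrete counterpart of the conclusion drawn from (7.51).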


If $\{Z(t); t \in \mathbb{R}\}$ is a Gaussian process, then, from Theorem 7.4.1, $\{V(t); t \in \mathbb{R}\}$ is also a Gaussian
process. Since a Gaussian process is determined by its covariance function, it follows that if $Z(t)$
is a stationary Gaussian process, then $V(t)$ is also a stationary Gaussian process.


We do not have the mathematical machinery to carry out the above operations carefully over
the infinite time interval$^{14}$. Rather, it is now assumed that $\{Z(t); t \in \mathbb{R}\}$ is effectively WSS
within $[-T_0/2,\, T_0/2]$. It will also be assumed that the impulse response $h(t)$ above is time-limited
in the sense that for some finite $A$, $h(t) = 0$ for $|t| > A$.

Theorem 7.5.3. Let $\{Z(t); t \in \mathbb{R}\}$ be effectively WSS within $[-T_0/2,\, T_0/2]$ and have sample functions
that are $\mathcal{L}_2$ within $[-T_0/2,\, T_0/2]$ with probability 1. Let $Z(t)$ be the input to a filter with an $\mathcal{L}_2$
time-limited impulse response $\{h(t) : [-A, A] \to \mathbb{R}\}$. Then for $T_0/2 > A$, the output random process
$\{V(t); t \in \mathbb{R}\}$ is WSS within $[-T_0/2 + A,\, T_0/2 - A]$ and its sample functions within $[-T_0/2 + A,\, T_0/2 - A]$
are $\mathcal{L}_2$ with probability 1.


Proof: Let $z(t)$ be a sample function of $Z(t)$ and assume $z(t)$ is $\mathcal{L}_2$ within $[-T_0/2,\, T_0/2]$. Let
$v(\tau) = \int z(t) h(\tau - t)\, dt$ be the corresponding filter output. For each $\tau \in [-T_0/2 + A,\, T_0/2 - A]$, $v(\tau)$
is determined by $z(t)$ in the range $t \in [-T_0/2,\, T_0/2]$. Thus, if we replace $z(t)$ by $z_0(t) = z(t)\,\mathrm{rect}(t/T_0)$,
the filter output, say $v_0(\tau)$, will equal $v(\tau)$ for $\tau \in [-T_0/2 + A,\, T_0/2 - A]$. The time-limited function
$z_0(t)$ is $\mathcal{L}_1$ as well as $\mathcal{L}_2$. This implies that the Fourier transform $\hat{z}_0(f)$ is bounded, say by
$|\hat{z}_0(f)| \le B$, for each $f$. Since $\hat{v}_0(f) = \hat{z}_0(f)\hat{h}(f)$, we see that

$$\int |\hat{v}_0(f)|^2\, df = \int |\hat{z}_0(f)|^2 |\hat{h}(f)|^2\, df \le B^2 \int |\hat{h}(f)|^2\, df < \infty.$$

This means that $\hat{v}_0(f)$, and thus also $v_0(t)$, is $\mathcal{L}_2$. Now $v_0(t)$, when truncated to $[-T_0/2 + A,\, T_0/2 - A]$,
is equal to $v(t)$ truncated to $[-T_0/2 + A,\, T_0/2 - A]$, so the truncated version of $v(t)$ is $\mathcal{L}_2$. Thus the
sample functions of $\{V(t)\}$, truncated to $[-T_0/2 + A,\, T_0/2 - A]$, are $\mathcal{L}_2$ with probability 1.


Finally, since $\{Z(t); t \in \mathbb{R}\}$ can be truncated to $[-T_0/2,\, T_0/2]$ with no lack of generality, it follows
that $K_Z(t_1, t_2)$ can be truncated to $t_1, t_2 \in [-T_0/2,\, T_0/2]$. Thus, for $t, \tau \in [-T_0/2 + A,\, T_0/2 - A]$, (7.50)
becomes

$$K_V(t, \tau) = \int_{-T_0/2}^{T_0/2} \int_{-T_0/2}^{T_0/2} h(t - t_1)\, K_Z(t_1 - t_2)\, h(\tau - t_2)\, dt_1\, dt_2. \tag{7.52}$$

The argument in (7.50, 7.51) shows that $V(t)$ is effectively WSS within $[-T_0/2 + A,\, T_0/2 - A]$.


The above theorem, along with the effective WSS result about linear functionals, shows us that
results about WSS processes can be used within finite intervals. The result in the theorem about
the interval of effective stationarity being reduced by filtering should not be too surprising. If
we truncate a process, and then pass it through a filter, the filter spreads out the effect of the
truncation. For a finite duration filter, however, as assumed here, this spreading is limited.


14 More important, we have no justification for modeling a process over the infinite time interval. Later, however,
after building up some intuition about the relationship of an infinite interval to a very large interval, we can use
the simpler equations corresponding to infinite intervals.


The notion of stationarity (or effective stationarity) makes sense as a modeling tool where $T_0$
above is very much larger than other durations of interest, and in fact where there is no need
for explicit concern about how long the process is going to be stationary.


The above theorem essentially tells us that we can have our cake and eat it too. That is,
transmitted waveforms and noise processes can be truncated, thus making use of both common
sense and $\mathcal{L}_2$ theory, but at the same time insights about stationarity can still be relied upon.
More specifically, random processes can be modeled as stationary, without specifying a specific
interval $[-T_0/2,\, T_0/2]$ of effective stationarity, because stationary processes can now be viewed as
asymptotic versions of finite duration processes.


Appendices 7A.2 and 7A.3 provide a deeper analysis of WSS processes truncated to an interval. 
The truncated process is represented as a Fourier series with random variables as coefficients. 
This gives a clean interpretation of what happens as the interval size is increased without bound, 
and also gives a clean interpretation of the effect of time-truncation in the frequency domain. 
Another approach to a truncated process is the Karhunen-Loeve expansion, which is discussed 
in 7A.4. 


7.6 Stationary and WSS processes in the Frequency Domain 


Stationary and WSS zero-mean processes, and particularly Gaussian processes, are often viewed
more insightfully in the frequency domain than in the time domain. An effectively WSS process
over $[-T_0/2,\, T_0/2]$ has a single variable covariance function $K_Z(\tau)$ defined over $[-T_0, T_0]$. A WSS
process can be viewed as a process that is effectively WSS for each $T_0$. The energy in such a
process, truncated to $[-T_0/2,\, T_0/2]$, is linearly increasing in $T_0$, but the covariance simply becomes
defined over a larger and larger interval as $T_0 \to \infty$. Assume in what follows that this limiting
covariance is $\mathcal{L}_2$. This does not appear to rule out any but the most pathological processes.


First we look at linear functionals and linear filters, ignoring limiting questions and convergence
issues and assuming that $T_0$ is 'large enough'. We will refer to the random processes as stationary,
while still assuming $\mathcal{L}_2$ sample functions.


For a zero-mean WSS process $\{Z(t); t \in \mathbb{R}\}$ and a real $\mathcal{L}_2$ function $g(t)$, consider the linear
functional $V = \int g(t) Z(t)\, dt$. From (7.49),

$$E[V^2] = \int_{-\infty}^{\infty} g(t) \left[ \int_{-\infty}^{\infty} K_Z(t - \tau)\, g(\tau)\, d\tau \right] dt \tag{7.53}$$

$$\phantom{E[V^2]} = \int_{-\infty}^{\infty} g(t)\, [K_Z * g](t)\, dt, \tag{7.54}$$

where $K_Z * g$ denotes the convolution of the waveforms $K_Z(t)$ and $g(t)$. Let $S_Z(f)$ be the Fourier
transform of $K_Z(t)$. The function $S_Z(f)$ is called the spectral density of the stationary process
$\{Z(t); t \in \mathbb{R}\}$. Since $K_Z(t)$ is $\mathcal{L}_2$, real, and symmetric, its Fourier transform is also $\mathcal{L}_2$, real, and
symmetric, and, as shown later, $S_Z(f) \ge 0$. It is also shown later that $S_Z(f)$ at each frequency
$f$ can be interpreted as the power per unit frequency at $f$.


Let $\theta(t) = [K_Z * g](t)$ be the convolution of $K_Z$ and $g$. Since $g$ and $K_Z$ are real, $\theta(t)$ is also real
so $\theta(t) = \theta^*(t)$. Using Parseval's theorem for Fourier transforms,

$$E[V^2] = \int_{-\infty}^{\infty} g(t)\, \theta^*(t)\, dt = \int_{-\infty}^{\infty} \hat{g}(f)\, \hat{\theta}^*(f)\, df.$$

Since $\theta(t)$ is the convolution of $K_Z$ and $g$, we see that $\hat{\theta}(f) = S_Z(f)\hat{g}(f)$. Thus,

$$E[V^2] = \int_{-\infty}^{\infty} \hat{g}(f)\, S_Z(f)\, \hat{g}^*(f)\, df = \int_{-\infty}^{\infty} |\hat{g}(f)|^2 S_Z(f)\, df. \tag{7.55}$$

Note that $E[V^2] \ge 0$ and that this holds for all real $\mathcal{L}_2$ functions $g(t)$. The fact that $g(t)$ is
real constrains the transform $\hat{g}(f)$ to satisfy $\hat{g}(f) = \hat{g}^*(-f)$, and thus $|\hat{g}(f)| = |\hat{g}(-f)|$ for all $f$.
Subject to this constraint and the constraint that $|\hat{g}(f)|$ be $\mathcal{L}_2$, $|\hat{g}(f)|$ can be chosen as any $\mathcal{L}_2$
function. Stated another way, $\hat{g}(f)$ can be chosen arbitrarily for $f \ge 0$ subject to being $\mathcal{L}_2$.


Since $S_Z(f) = S_Z(-f)$, (7.55) can be rewritten as

$$E[V^2] = \int_0^{\infty} 2\, |\hat{g}(f)|^2 S_Z(f)\, df.$$

Since $E[V^2] \ge 0$ and $|\hat{g}(f)|$ is arbitrary, it follows that $S_Z(f) \ge 0$ for all $f \in \mathbb{R}$.

The conclusion here is that the spectral density of any WSS random process must be nonnegative.
Since $S_Z(f)$ is also the Fourier transform of $K_Z(t)$, this means that a necessary property of any
single variable covariance function is that it have a nonnegative Fourier transform.
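This necessary condition can be used as a quick screening test for candidate covariance functions. The sketch below is an illustration under assumptions not in the text: it uses a midpoint-rule approximation of the Fourier transform, and two hypothetical candidates, a triangular pulse (whose transform is $\mathrm{sinc}^2(f)$) and a rectangular pulse (whose transform is $\mathrm{sinc}(f)$, which changes sign).

```python
import math

def spectral_density(k, f, half_width=1.0, n=4000):
    """Midpoint-rule approximation of the Fourier transform at frequency f
    of a real, even function k(t) supported on [-half_width, half_width]."""
    dt = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        t = -half_width + (i + 0.5) * dt
        total += k(t) * math.cos(2.0 * math.pi * f * t) * dt
    return total

triangle = lambda t: max(0.0, 1.0 - abs(t))      # candidate 1
rect = lambda t: 1.0 if abs(t) <= 0.5 else 0.0   # candidate 2

freqs = [0.05 * i for i in range(81)]            # frequencies 0 to 4
min_triangle = min(spectral_density(triangle, f) for f in freqs)
rect_at_1p5 = spectral_density(rect, 1.5)
print(min_triangle, rect_at_1p5)
```

The triangular pulse passes the test on this grid (its transform stays nonnegative up to numerical error), whereas the rectangular pulse has a clearly negative transform near $f = 1.5$ and so cannot be the single variable covariance function of any WSS process.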


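Next, let $V_m = \int g_m(t) Z(t)\, dt$ where the function $g_m(t)$ is real and $\mathcal{L}_2$ for $m = 1, 2$. From (7.49),

$$E[V_1 V_2] = \int_{-\infty}^{\infty} g_1(t) \left[ \int_{-\infty}^{\infty} K_Z(t - \tau)\, g_2(\tau)\, d\tau \right] dt \tag{7.56}$$

$$\phantom{E[V_1 V_2]} = \int_{-\infty}^{\infty} g_1(t)\, [K_Z * g_2](t)\, dt. \tag{7.57}$$

Let $\hat{g}_m(f)$ be the Fourier transform of $g_m(t)$ for $m = 1, 2$, and let $\theta(t) = [K_Z * g_2](t)$ be the
convolution of $K_Z$ and $g_2$. Let $\hat{\theta}(f) = S_Z(f)\hat{g}_2(f)$ be its Fourier transform. As before, we have

$$E[V_1 V_2] = \int \hat{g}_1(f)\, \hat{\theta}^*(f)\, df = \int \hat{g}_1(f)\, S_Z(f)\, \hat{g}_2^*(f)\, df. \tag{7.58}$$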


There is a remarkable feature in the above expression. If $\hat{g}_1(f)$ and $\hat{g}_2(f)$ have no overlap in
frequency, then $E[V_1 V_2] = 0$. In other words, for any stationary process, two linear functionals
over different frequency ranges must be uncorrelated. If the process is Gaussian, then the linear
functionals are independent. This means in essence that Gaussian noise in different frequency
bands must be independent. That this is true simply because of stationarity is surprising.
Appendix 7A.3 helps to explain this puzzling phenomenon, especially with respect to effective
stationarity.


Next, let $\{\phi_m(t); m \in \mathbb{Z}\}$ be a set of real orthonormal functions and let $\{\hat{\phi}_m(f)\}$ be the
corresponding set of Fourier transforms. Letting $V_m = \int Z(t)\phi_m(t)\, dt$, (7.58) becomes

$$E[V_m V_j] = \int \hat{\phi}_m(f)\, S_Z(f)\, \hat{\phi}_j^*(f)\, df. \tag{7.59}$$

If the set of orthonormal functions $\{\phi_m(t); m \in \mathbb{Z}\}$ is limited to some frequency band, and if
$S_Z(f)$ is constant, say with value $N_0/2$ in that band, then

$$E[V_m V_j] = \frac{N_0}{2} \int \hat{\phi}_m(f)\, \hat{\phi}_j^*(f)\, df. \tag{7.60}$$

By Parseval's theorem for Fourier transforms, we have $\int \hat{\phi}_m(f)\hat{\phi}_j^*(f)\, df = \delta_{mj}$, and thus

$$E[V_m V_j] = \frac{N_0}{2}\, \delta_{mj}. \tag{7.61}$$


The rather peculiar looking constant $N_0/2$ is explained in the next section. For now, however,
it is possible to interpret the meaning of the spectral density of a noise process. Suppose that
$S_Z(f)$ is continuous and approximately constant with value $S_Z(f_c)$ over some narrow band of
frequencies around $f_c$, and suppose that $\phi_1(t)$ is constrained to that narrow band. Then the
variance of the linear functional $\int_{-\infty}^{\infty} Z(t)\phi_1(t)\, dt$ is approximately $S_Z(f_c)$. In other words,
$S_Z(f_c)$ in some fundamental sense describes the energy in the noise per degree of freedom at the
frequency $f_c$. The next section interprets this further.


7.7 White Gaussian noise 


Physical noise processes are very often reasonably modeled as zero mean, stationary, and Gaus- 
sian. There is one further simplification that is often reasonable. This is that the covariance 
between the noise at two epochs dies out very rapidly as the interval between those epochs 
increases. The interval over which this covariance is significantly nonzero is often very small 
relative to the intervals over which the signal varies appreciably. This means that the covariance
function $K_Z(\tau)$ looks like a short-duration pulse around $\tau = 0$.


We know from linear system theory that $\int K_Z(t - \tau) g(\tau)\, d\tau$ is equal to $g(t)$ if $K_Z(t)$ is a unit
impulse. Also, this integral is approximately equal to $g(t)$ if $K_Z(t)$ has unit area and is a narrow
pulse relative to changes in $g(t)$. It follows that under the same circumstances, (7.56) becomes

$$E[V_1 V_2] = \int \int g_1(t)\, K_Z(t - \tau)\, g_2(\tau)\, d\tau\, dt \approx \int g_1(t)\, g_2(t)\, dt. \tag{7.62}$$
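The quality of the approximation in (7.62) can be probed numerically. The sketch below assumes a unit-area triangular covariance pulse of half-width `eps` (triangular so that its Fourier transform is nonnegative) and takes $g_1 = g_2 = e^{-t^2}$; the pulse shape, the test function, and the integration grid are illustrative assumptions, not part of the text.

```python
import math

def narrow_pulse_covariance(g1, g2, eps, lo=-5.0, hi=5.0, n=1000, m=10):
    """Midpoint-rule approximation of the double integral in (7.62) when
    K_Z(s) = (1/eps) * (1 - |s|/eps) for |s| <= eps (unit area)."""
    dt = (hi - lo) / n
    ds = 2.0 * eps / m
    total = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * dt
        inner = 0.0
        for j in range(m):                      # integrate over s = t - tau
            s = -eps + (j + 0.5) * ds
            k = (1.0 - abs(s) / eps) / eps
            inner += k * g2(t - s) * ds
        total += g1(t) * inner * dt
    return total

g = lambda t: math.exp(-t * t)
approx = narrow_pulse_covariance(g, g, eps=0.01)
exact = math.sqrt(math.pi / 2.0)    # integral of exp(-2 t^2) over the real line
print(approx, exact)
```

With the pulse width two orders of magnitude below the scale on which $g$ varies, the two numbers agree closely, which is the sense in which only the area of the covariance pulse matters.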


This means that if the covariance function is very narrow relative to the functions of interest, then
its behavior relative to those functions is specified by its area. In other words, the covariance
function can be viewed as an impulse of a given magnitude. We refer to a zero-mean WSS
Gaussian random process with such a narrow covariance function as White Gaussian Noise
(WGN). The area under the covariance function is called the intensity or the spectral density
of the WGN and is denoted by the symbol $N_0/2$. Thus, for $\mathcal{L}_2$ functions $g_1(t), g_2(t), \ldots$ in
the range of interest, and for WGN (denoted by $\{W(t); t \in \mathbb{R}\}$) of intensity $N_0/2$, the random
variable $V_m = \int W(t) g_m(t)\, dt$ has the variance

$$E[V_m^2] = (N_0/2) \int g_m^2(t)\, dt. \tag{7.63}$$

Similarly, the random variables $V_j$ and $V_m$ have the covariance

$$E[V_j V_m] = (N_0/2) \int g_j(t)\, g_m(t)\, dt. \tag{7.64}$$

Also $V_1, V_2, \ldots$ are jointly Gaussian.


The most important special case of (7.63) and (7.64) is to let $\{\phi_j(t)\}$ be a set of orthonormal
functions and let $W(t)$ be WGN of intensity $N_0/2$. Let $V_m = \int \phi_m(t) W(t)\, dt$. Then, from (7.63)
and (7.64),

$$E[V_j V_m] = (N_0/2)\, \delta_{jm}. \tag{7.65}$$


This is an important equation. It says that if the noise can be modeled as WGN, then when 
the noise is represented in terms of any orthonormal expansion, the resulting random variables 
are iid. Thus, we can represent signals in terms of an arbitrary orthonormal expansion, and 
represent WGN in terms of the same expansion, and the result is iid Gaussian random variables. 


Since the coefficients of a WGN process in any orthonormal expansion are iid Gaussian, it is 
common to also refer to a random vector of iid Gaussian rv’s as WGN. 
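The vector form of this statement lends itself to a quick Monte Carlo check. The sketch below is only an illustration: it assumes a 4-dimensional iid Gaussian vector with per-component variance $N_0/2 = 2$, two hypothetical orthonormal vectors, and a fixed random seed, none of which come from the text.

```python
import random

random.seed(0)
N0_OVER_2 = 2.0                  # assumed noise intensity
sigma = N0_OVER_2 ** 0.5

# Two orthonormal vectors in R^4 (a hypothetical choice for illustration).
phi1 = [0.5, 0.5, 0.5, 0.5]
phi2 = [0.5, -0.5, 0.5, -0.5]

def project(w, phi):
    """Inner product of the noise vector w with the basis vector phi."""
    return sum(wi * pi for wi, pi in zip(w, phi))

trials = 20000
s11 = s22 = s12 = 0.0
for _ in range(trials):
    w = [random.gauss(0.0, sigma) for _ in range(4)]   # discrete WGN vector
    v1, v2 = project(w, phi1), project(w, phi2)
    s11 += v1 * v1
    s22 += v2 * v2
    s12 += v1 * v2

var1, var2, cov12 = s11 / trials, s22 / trials, s12 / trials
print(var1, var2, cov12)   # near N0/2, N0/2, and 0, as (7.65) suggests
```

The empirical variances cluster around $N_0/2$ and the empirical cross term around 0, in line with (7.65); any other orthonormal pair could be substituted for `phi1` and `phi2`.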


If $K_W(t)$ is approximated by $(N_0/2)\delta(t)$, then the spectral density is approximated by $S_W(f) =
N_0/2$. If we are concerned with a particular band of frequencies, then we are interested in
$S_W(f)$ being constant within that band, and in this case, $\{W(t); t \in \mathbb{R}\}$ can be represented as
white noise within that band. If this is the only band of interest, we can model$^{15}$ $S_W(f)$ as
equal to $N_0/2$ everywhere, in which case the corresponding model for the covariance function is
$(N_0/2)\delta(t)$.

The careful reader will observe that WGN has not really been defined. What has been said,
in essence, is that if a stationary zero-mean Gaussian process has a covariance function that
is very narrow relative to the variation of all functions of interest, or a spectral density that
is constant within the frequency band of interest, then we can pretend that the covariance
function is an impulse times $N_0/2$, where $N_0/2$ is the value of $S_W(f)$ within the band of
interest. Unfortunately, according to the definition of random process, there cannot be any
Gaussian random process $W(t)$ whose covariance function is $K_W(t) = (N_0/2)\delta(t)$. The reason for
this dilemma is that $E[W^2(t)] = K_W(0)$. We could interpret $K_W(0)$ to be either undefined or
$\infty$, but either way, $W(t)$ cannot be a random variable (although we could think of it taking on
only the values plus or minus $\infty$).


Mathematicians view WGN as a generalized random process, in the same sense as the unit
impulse $\delta(t)$ is viewed as a generalized function. That is, the impulse function $\delta(t)$ is not viewed
as an ordinary function taking the value 0 for $t \ne 0$ and the value $\infty$ at $t = 0$. Rather, it is viewed
in terms of its effect on other, better behaved, functions $g(t)$, where $\int_{-\infty}^{\infty} g(t)\delta(t)\, dt = g(0)$. In
the same way, WGN is not viewed in terms of random variables at each epoch of time. Rather
it is viewed as a generalized zero-mean random process for which linear functionals are jointly
Gaussian, for which variances and covariances are given by (7.63) and (7.64), and for which the
covariance is formally taken to be $(N_0/2)\delta(t)$.


Engineers should view WGN within the context of an overall bandwidth and time interval of 
interest, where the process is effectively stationary within the time interval and has a constant 
spectral density over the band of interest. Within that context, the spectral density can be 
viewed as constant, the covariance can be viewed as an impulse, and (7.63) and (7.64) can be 
used. 


The difference between the engineering view and the mathematical view is that the engineering
view is based on a context of given time interval and bandwidth of interest, whereas the
mathematical view is based on a very careful set of definitions and limiting operations within which
theorems can be stated without explicitly defining the context. Although the ability to prove
theorems without stating the context is valuable, any application must be based on the context.


15 This is not as obvious as it sounds, and will be further discussed in terms of the theorem of irrelevance in the
next chapter.


7.7.1 The sinc expansion as an approximation to WGN 


Theorem 7.5.2 treated the process $Z(t) = \sum_k Z_k\, \mathrm{sinc}\!\left(\frac{t - kT}{T}\right)$ where each rv $\{Z_k; k \in \mathbb{Z}\}$ is iid
and $\mathcal{N}(0, \sigma^2)$. We found that the process is zero-mean Gaussian and stationary with covariance
function $K_Z(t - \tau) = \sigma^2\, \mathrm{sinc}\!\left(\frac{t - \tau}{T}\right)$. The spectral density for this process is then given by

$$S_Z(f) = \sigma^2 T\, \mathrm{rect}(fT). \tag{7.66}$$

This process has a constant spectral density over the baseband bandwidth $W = 1/(2T)$, so by
making $T$ sufficiently small, the spectral density is constant over a band sufficiently large to
include all frequencies of interest. Thus this process can be viewed as WGN of spectral density
$N_0/2 = \sigma^2 T$ for any desired range of frequencies $W = 1/(2T)$ by making $T$ sufficiently small. Note,
however, that to approximate WGN of spectral density $N_0/2$, the noise power, i.e., the variance
of $Z(t)$, is $\sigma^2 = W N_0$. In other words, $\sigma^2$ must increase with increasing $W$. This also says that $N_0$
is the noise power per unit positive frequency. The spectral density, $N_0/2$, is defined over both
positive and negative frequencies, and so becomes $N_0$ when positive and negative frequencies
are combined as in the standard definition of bandwidth$^{16}$.
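The bookkeeping between $N_0$, $W$, $T$, and $\sigma^2$ is easy to get wrong by a factor of 2, so a small helper can be useful. This is a minimal sketch; the function name and the numerical values are hypothetical and chosen only so that the arithmetic is exact.

```python
def sinc_wgn_parameters(n0, w):
    """Return (T, sigma_sq) for a sinc process that approximates WGN of
    spectral density n0/2 over the baseband bandwidth w = 1/(2T)."""
    t_spacing = 1.0 / (2.0 * w)   # sample spacing T
    sigma_sq = n0 * w             # from n0/2 = sigma_sq * T
    return t_spacing, sigma_sq

T, sigma_sq = sinc_wgn_parameters(n0=4.0, w=8.0)
print(T, sigma_sq, sigma_sq * T)   # sigma_sq * T recovers n0/2
```

Doubling `w` while holding `n0` fixed doubles `sigma_sq`, which is the statement that the noise power must grow with the bandwidth over which the process is to look white.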


If a sinc process is passed through a linear filter with an arbitrary impulse response $h(t)$, the
output is a stationary Gaussian process with spectral density $|\hat{h}(f)|^2 \sigma^2 T\, \mathrm{rect}(fT)$. Thus, by
using a sinc process plus a linear filter, a stationary Gaussian process with any desired
nonnegative spectral density within any desired finite bandwidth can be generated. In other words,
stationary Gaussian processes with arbitrary covariances (subject to $S(f) \ge 0$) can be generated
from orthonormal expansions of Gaussian variables.


Since the sinc process is stationary, it has sample waveforms of infinite energy. As explained in
subsection 7.5.2, this process may be truncated to achieve an effectively stationary process with
$\mathcal{L}_2$ sample waveforms. Appendix 7A.3 provides some insight about how an effectively stationary
Gaussian process over an interval $T_0$ approaches stationarity as $T_0 \to \infty$.


The sinc process can also be used to understand the strange, everywhere uncorrelated, process
in Example 7.4.2. Holding $\sigma^2 = 1$ in the sinc expansion as $T$ approaches 0, we get a process
whose limiting covariance function is 1 for $t - \tau = 0$ and 0 elsewhere. The corresponding limiting
spectral density is 0 everywhere. What is happening is that the power in the process (i.e., $K_Z(0)$)
is 1, but that power is being spread over a wider and wider band as $T \to 0$, so the power per
unit frequency goes to 0.


To explain this in another way, note that any measurement of this noise process must involve
filtering over some very small but nonzero interval. The output of this filter will have zero
variance. Mathematically, of course, the limiting covariance is $\mathcal{L}_2$-equivalent to 0, so again the
mathematics$^{17}$ corresponds to engineering reality.


16 One would think that this field would have found a way to be consistent about counting only positive
frequencies or positive and negative frequencies. However, the word bandwidth is so widely used among the
mathophobic, and Fourier analysis is so necessary for engineers, that one must simply live with such minor
confusions.

17 This process also cannot be satisfactorily defined in a measure theoretic way.


7.7.2 Poisson process noise


The sinc process of the last subsection is very convenient for generating noise processes that
approximate WGN in an easily used formulation. On the other hand, this process is not very
believable$^{18}$ as a physical process. A model that corresponds better to physical phenomena,
particularly for optical channels, is a sequence of very narrow pulses which arrive according to
a Poisson distribution in time.


The Poisson distribution, for our purposes, can be simply viewed as a limit of a discrete time
process where the time axis is segmented into intervals of duration $\Delta$ and a pulse of width $\Delta$
arrives in each interval with probability $\Delta\rho$, independent of every other interval. When such a
process is passed through a linear filter, the fluctuation of the output at each instant of time is
approximately Gaussian if the filter is of sufficiently small bandwidth to integrate over a very
large number of pulses. One can similarly argue that linear combinations of filter outputs tend
to be approximately Gaussian, making the process an approximation of a Gaussian process.
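The discrete-time construction described above is simple to simulate. The following sketch counts pulses in a unit time interval; the values $\rho = 5$ and $\Delta = 0.001$, the number of trials, and the seed are assumptions made for illustration only.

```python
import random

def count_pulses(rho, delta, duration, rng):
    """Number of pulses in `duration` when each slot of width delta
    independently contains a pulse with probability delta * rho."""
    slots = int(round(duration / delta))
    p = delta * rho
    return sum(1 for _ in range(slots) if rng.random() < p)

rng = random.Random(0)
rho, delta = 5.0, 0.001
counts = [count_pulses(rho, delta, 1.0, rng) for _ in range(2000)]

mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print(mean, var)   # both should be close to rho for small delta
```

For small $\Delta$ the count per unit time has mean and variance both close to $\rho$, as expected of a Poisson count; a filter that integrates over many such pulses sums many nearly independent contributions, which is the informal reason its output fluctuation is approximately Gaussian.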


We do not analyze this carefully, since our point of view is that WGN, over limited bandwidths, 
is a reasonable and canonic approximation to a large number of physical noise processes. After 
understanding how this affects various communication systems, one can go back and see whether 
the model is appropriate for the given physical noise process. When we study wireless commu- 
nication, we will find that the major problem is not that the noise is poorly approximated by 
WGN, but rather that the channel itself is randomly varying. 


7.8 Adding noise to modulated communication 


Consider the QAM communication problem again. A complex $\mathcal{L}_2$ baseband waveform $u(t)$ is
generated and modulated up to passband as a real waveform $x(t) = 2\Re[u(t) e^{2\pi i f_c t}]$. A sample
function $w(t)$ of a random noise process $W(t)$ is then added to $x(t)$ to produce the output
$y(t) = x(t) + w(t)$, which is then demodulated back to baseband as the received complex baseband
waveform $v(t)$.


Generalizing QAM somewhat, assume that $u(t)$ is given by $u(t) = \sum_k u_k \theta_k(t)$ where the
functions $\theta_k(t)$ are complex orthonormal functions and the sequence of symbols $\{u_k; k \in \mathbb{Z}\}$ are
complex numbers drawn from the symbol alphabet and carrying the information to be
transmitted. For each symbol $u_k$, $\Re(u_k)$ and $\Im(u_k)$ should be viewed as sample values of the random
variables $\Re(U_k)$ and $\Im(U_k)$. The joint probability distributions of these random variables is
determined by the incoming random binary digits and how they are mapped into symbols. The
complex random variable$^{19}$ $\Re(U_k) + i\Im(U_k)$ is then denoted by $U_k$.


In the same way, $\Re(\sum_k U_k \theta_k(t))$ and $\Im(\sum_k U_k \theta_k(t))$ are random processes denoted
respectively by $\Re(U(t))$ and $\Im(U(t))$. We then call $U(t) = \Re(U(t)) + i\Im(U(t))$ for $t \in \mathbb{R}$ a
complex random process. A complex random process $U(t)$ is defined by the joint distribution of
$U(t_1), U(t_2), \ldots, U(t_n)$ for all choices of $n, t_1, \ldots, t_n$. This is equivalent to defining both $\Re(U(t))$
and $\Im(U(t))$ as joint processes.


18 To many people, defining these sinc processes with their easily analyzed properties but no physical justification
is more troublesome than our earlier use of discrete memoryless sources in studying source coding. Actually, the
approach to modeling is the same in each case: first understand a class of easy-to-analyze but perhaps impractical
processes, then build on that understanding to understand practical cases. In fact, sinc processes have an
advantage here: the band limited stationary Gaussian random processes defined this way (although not the
method of generation) are widely used as practical noise models, whereas there are virtually no uses of discrete
memoryless sources as practical source models.

19 Recall that a random variable (rv) is a mapping from sample points to real numbers, so that a complex rv is
a mapping from sample points to complex numbers. Sometimes in discussions involving both rv's and complex
rv's, it helps to refer to rv's as real rv's, but the modifier 'real' is superfluous.

Recall from the discussion of the Nyquist criterion that if the QAM transmit pulse $p(t)$ is
chosen to be square-root of Nyquist, then $p(t)$ and its $T$-spaced shifts are orthogonal and can be
normalized to be orthonormal. Thus a particularly natural choice here is $\theta_k(t) = p(t - kT)$ for
such a $p$. Note that this is a generalization of the previous chapter in the sense that $\{U_k; k \in \mathbb{Z}\}$
is a sequence of complex rv's using random choices from the signal constellation rather than some
given sample function of that random sequence. The transmitted passband (random) waveform
is then

$$X(t) = \sum_k 2\Re\{U_k \theta_k(t) \exp[2\pi i f_c t]\}. \tag{7.67}$$


Recall that the transmitted waveform has twice the power of the baseband waveform. Now
define

$$\psi_{k,1}(t) = \Re\{2\theta_k(t) \exp[2\pi i f_c t]\}; \qquad \psi_{k,2}(t) = \Im\{-2\theta_k(t) \exp[2\pi i f_c t]\}.$$

Also, let $U_{k,1} = \Re(U_k)$ and $U_{k,2} = \Im(U_k)$. Then

$$X(t) = \sum_k \left[ U_{k,1}\psi_{k,1}(t) + U_{k,2}\psi_{k,2}(t) \right].$$
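The equivalence between a single term of (7.67) and its decomposition into $\psi_{k,1}$ and $\psi_{k,2}$ can be confirmed with complex arithmetic at any one time instant. In the sketch below the symbol value, the value of $\theta_k(t)$, the carrier frequency, and the time are arbitrary illustrative numbers, and the function name is hypothetical.

```python
import cmath

def passband_terms(u, theta, fc, t):
    """Return (direct, decomposed) for one symbol u at one time instant t,
    where theta is the complex value of theta_k(t) at that instant."""
    carrier = cmath.exp(2j * cmath.pi * fc * t)
    direct = 2.0 * (u * theta * carrier).real      # one term of (7.67)
    psi1 = (2.0 * theta * carrier).real            # psi_{k,1}(t)
    psi2 = (-2.0 * theta * carrier).imag           # psi_{k,2}(t)
    decomposed = u.real * psi1 + u.imag * psi2     # U_{k,1} psi_{k,1} + U_{k,2} psi_{k,2}
    return direct, decomposed

direct, decomposed = passband_terms(u=1.5 - 0.5j, theta=0.3 + 0.8j, fc=10.0, t=0.123)
print(direct, decomposed)   # the two values agree up to rounding
```

The minus sign in the definition of $\psi_{k,2}$ is what makes the two expressions match, since $\Re\{(a + ib)z\} = a\Re\{z\} - b\Im\{z\}$.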


As shown in Theorem 6.6.1, the set of bandpass functions $\{\psi_{k,\ell}; k \in \mathbb{Z}, \ell \in \{1, 2\}\}$ are orthogonal
and each have energy equal to 2. This again assumes that the carrier frequency $f_c$ is greater
than all frequencies in each baseband function $\theta_k(t)$.


In order for $u(t)$ to be $\mathcal{L}_2$, assume that the number of orthogonal waveforms $\theta_k(t)$ is arbitrarily
large but finite, say $\theta_1(t), \ldots, \theta_n(t)$. Thus $\{\psi_{k,\ell}\}$ is also limited to $1 \le k \le n$.


Assume that the noise $\{W(t); t \in \mathbb{R}\}$ is white over the band of interest and effectively stationary
over the time interval of interest, but has $\mathcal{L}_2$ sample functions$^{20}$. Since $\{\psi_{k,\ell}; 1 \le k \le n, \ell = 1, 2\}$
is a finite real orthogonal set, the projection theorem can be used to express each sample noise
waveform $\{w(t); t \in \mathbb{R}\}$ as

$$w(t) = \sum_{k=1}^{n} \left[ z_{k,1}\psi_{k,1}(t) + z_{k,2}\psi_{k,2}(t) \right] + w_\perp(t), \tag{7.68}$$

where $w_\perp(t)$ is the component of the sample noise waveform perpendicular to the space spanned
by $\{\psi_{k,\ell}; 1 \le k \le n, \ell = 1, 2\}$. Let $Z_{k,\ell}$ be the rv with sample value $z_{k,\ell}$. Then each rv $Z_{k,\ell}$
is a linear functional on $W(t)$. Since $\{\psi_{k,\ell}; 1 \le k \le n, \ell = 1, 2\}$ is an orthogonal set, the
rv's $Z_{k,\ell}$ are iid Gaussian rv's. Let $W_\perp(t)$ be the random process corresponding to the sample
function $w_\perp(t)$ above. Expanding $\{W_\perp(t); t \in \mathbb{R}\}$ in an orthonormal expansion orthogonal to
$\{\psi_{k,\ell}; 1 \le k \le n, \ell = 1, 2\}$, the coefficients are assumed to be independent of the $Z_{k,\ell}$, at least
over the time and frequency band of interest. What happens to these coefficients outside of the
region of interest is of no concern, other than assuming that $W_\perp(t)$ is independent of $U_{k,\ell}$ and
$Z_{k,\ell}$ for $1 \le k \le n$ and $\ell \in \{1, 2\}$. The received waveform $Y(t) = X(t) + W(t)$ is then

$$Y(t) = \sum_{k=1}^{n} \left[ (U_{k,1} + Z_{k,1})\psi_{k,1}(t) + (U_{k,2} + Z_{k,2})\psi_{k,2}(t) \right] + W_\perp(t).$$


20 Since the set of orthogonal waveforms $\theta_k(t)$ are not necessarily time or frequency limited, the assumption
here is that the noise is white over a much larger time and frequency interval than the nominal bandwidth and
time interval used for communication. This assumption is discussed further in the next chapter.


When this is demodulated,^21 the baseband waveform is represented as the complex waveform

$$ V(t) = \sum_k (U_k + Z_k)\theta_k(t) + Z_\perp(t), \tag{7.69} $$

where each $Z_k$ is a complex rv given by $Z_k = Z_{k,1} + iZ_{k,2}$ and the baseband residual noise $Z_\perp(t)$
is independent of $\{U_k, Z_k;\ 1 \le k \le n\}$. The variance of each real rv $Z_{k,1}$ and $Z_{k,2}$ is taken by
convention to be $N_0/2$. We follow this convention because we are measuring the input power
at baseband; as mentioned many times, the power at passband is scaled to be twice that at
baseband. The point here is that $N_0$ is not a physical constant; rather, it is the noise power per
unit positive frequency in the units used to represent the signal power.


7.8.1 Complex Gaussian random variables and vectors 


Noise waveforms, after demodulation to baseband, are usually complex and are thus represented,
as in (7.69), by a sequence of complex random variables, best regarded as a complex random
vector (rv). It is possible to view any such $n$-dimensional complex rv $Z = Z_{re} + iZ_{im}$ as a
$2n$-dimensional real rv $\begin{bmatrix} Z_{re} \\ Z_{im} \end{bmatrix}$, where $Z_{re} = \Re(Z)$ and $Z_{im} = \Im(Z)$.

For many of the same reasons that it is desirable to work directly with a complex baseband 
waveform rather than a pair of real passband waveforms, it is often beneficial to work directly 
with complex rv’s. 


Definition 7.8.1. A complex random variable $Z = Z_{re} + iZ_{im}$ is Gaussian if $Z_{re}$ and $Z_{im}$ are
jointly Gaussian; $Z$ is circularly-symmetric Gaussian^22 if it is Gaussian and $Z_{re}$ and $Z_{im}$ are
zero mean and iid.

The amplitude of a circularly-symmetric Gaussian rv is Rayleigh distributed and the phase is
uniform, i.e., it has circular symmetry. A circularly-symmetric Gaussian rv $Z$ is fully described
by its variance $\sigma^2 = E[ZZ^*]$ and is denoted as $Z \sim \mathcal{CN}(0, \sigma^2)$. Note that the real and imaginary
parts of $Z$ are then iid with variance $\sigma^2/2$ each.
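
As a rough numerical illustration of the scalar definition, the sketch below draws samples of a circularly-symmetric Gaussian rv and estimates $E[ZZ^*]$ and $E[ZZ]$. The function names, the seed, and the sample size are illustrative assumptions, not part of the text.

```python
import cmath
import math
import random

def sample_cn(sigma2, rng):
    """Draw one sample of Z ~ CN(0, sigma2): real and imaginary parts iid N(0, sigma2/2)."""
    s = math.sqrt(sigma2 / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def empirical_moments(samples):
    """Return sample estimates of (E[Z Z*], E[Z Z])."""
    n = len(samples)
    variance = sum(abs(z) ** 2 for z in samples) / n
    pseudo_variance = sum(z * z for z in samples) / n
    return variance, pseudo_variance

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_cn(2.0, rng) for _ in range(50_000)]
variance, pseudo_variance = empirical_moments(samples)

# Rotating every sample by a fixed phase leaves |Z|^2 unchanged,
# so the variance estimate is unchanged as well.
rotated = [cmath.exp(0.7j) * z for z in samples]
rotated_variance, rotated_pseudo = empirical_moments(rotated)
print(variance, abs(pseudo_variance), rotated_variance, abs(rotated_pseudo))
```

The variance estimate should be close to the chosen $\sigma^2 = 2$, and the estimate of $E[ZZ]$ should be close to 0 both before and after rotation.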


Definition 7.8.2. A complex random vector (rv) $Z = (Z_1, \ldots, Z_n)^T$ is jointly Gaussian if the
$2n$ real and imaginary components of $Z$ are jointly Gaussian. It is circularly symmetric if the
distribution of $Z$ (i.e., the joint distribution of the real and imaginary parts) is the same as that
of $e^{i\theta}Z$ for all phase angles $\theta$. It is circularly-symmetric Gaussian if it is jointly Gaussian and
circularly symmetric.


^21 Some filtering is necessary before demodulation to remove the residual noise that is far out of band, but we
do not want to analyze that here.
^22 This is sometimes referred to as complex proper Gaussian.

Example 7.8.1. An important example of a circularly-symmetric Gaussian rv is $W =
(W_1, \ldots, W_n)^T$ where the components $W_k$, $1 \le k \le n$, are statistically independent and each
is $\mathcal{CN}(0,1)$. Since each $W_k$ is $\mathcal{CN}(0,1)$, it can be seen that $e^{i\theta}W_k$ has the same distribution
as $W_k$. Using the independence, it can be seen that $e^{i\theta}W$ then has the same distribution as
$W$. The $2n$ real and imaginary components of $W$ are iid and $\mathcal{N}(0, 1/2)$ so that the probability
density is

$$ f_W(w) = \frac{1}{\pi^n} \exp\left(-\sum_{k=1}^{n} |w_k|^2\right), \tag{7.70} $$

where we have used the fact that $|w_k|^2 = \Re(w_k)^2 + \Im(w_k)^2$ for each $k$ to replace a sum over $2n$
terms with a sum over $n$ terms.


Definition 7.8.3. The covariance matrix $K_Z$ and the pseudo-covariance matrix $M_Z$ of a zero-
mean complex rv $Z = (Z_1, \ldots, Z_n)^T$ are the $n$ by $n$ matrices given respectively by

$$ K_Z = E[ZZ^\dagger], \qquad M_Z = E[ZZ^T], \tag{7.71} $$

where $Z^\dagger$ is the complex conjugate of the transpose, $Z^{T*}$.


For real zero-mean random vectors, the covariance matrix specifies all the second moments, and
thus in the jointly-Gaussian case, specifies the distribution. For complex rv's, both $K_Z$ and $M_Z$
combine to specify all the second moments. Specifically, a little calculation shows that

$$ E[\Re(Z_k)\Re(Z_j)] = \tfrac{1}{2}\Re[K_Z(k,j) + M_Z(k,j)], \qquad E[\Im(Z_k)\Im(Z_j)] = \tfrac{1}{2}\Re[K_Z(k,j) - M_Z(k,j)], $$

$$ E[\Re(Z_k)\Im(Z_j)] = \tfrac{1}{2}\Im[-K_Z(k,j) + M_Z(k,j)], \qquad E[\Im(Z_k)\Re(Z_j)] = \tfrac{1}{2}\Im[K_Z(k,j) + M_Z(k,j)]. $$

When $Z$ is a zero-mean, complex jointly-Gaussian rv then $K_Z$ and $M_Z$ specify the distribution
of $Z$, and thus $Z$ is circularly-symmetric Gaussian if and only if $K_Z = K_{e^{i\theta}Z}$ and $M_Z = M_{e^{i\theta}Z}$
for all phases $\theta$. Calculating these matrices for an arbitrary rv,

$$ K_{e^{i\theta}Z} = E[e^{i\theta}Z \cdot e^{-i\theta}Z^\dagger] = K_Z; \qquad M_{e^{i\theta}Z} = E[e^{i\theta}Z \cdot e^{i\theta}Z^T] = e^{2i\theta}M_Z. $$

Thus, $K_{e^{i\theta}Z}$ is always equal to $K_Z$ but $M_{e^{i\theta}Z}$ is equal to $M_Z$ for all real $\theta$ if and only if $M_Z$ is
the zero matrix. We have proven the following theorem.


Theorem 7.8.1. A zero-mean, complex jointly-Gaussian rv is circularly-symmetric Gaussian 
if and only if the pseudo-covariance matrix Mz is 0. 


Since $M_Z$ is zero for any circularly-symmetric Gaussian rv $Z$, the distribution of $Z$ is determined
solely by $K_Z$ and is denoted as $Z \sim \mathcal{CN}(0, K_Z)$ where $\mathcal{C}$ denotes that $Z$ is both complex and
circularly symmetric. The complex normalized iid rv of Example 7.8.1 is thus denoted as
$W \sim \mathcal{CN}(0, I_n)$.


The following two examples illustrate some subtleties in Theorem 7.8.1. 


Example 7.8.2. Let $Z = (Z_1, Z_2)^T$ where $Z_1 \sim \mathcal{CN}(0,1)$ and $Z_2 = UZ_1$ where $U$ is statistically
independent of $Z_1$ and has possible values $\pm 1$ with probability 1/2 each. It is easy to see that
$Z_2 \sim \mathcal{CN}(0,1)$, but the real and imaginary parts of $Z_1$ and $Z_2$ together are not jointly Gaussian.
In fact, the joint distribution of $\Re(Z_1)$ and $\Re(Z_2)$ is concentrated on the two diagonal axes and
$\Im(Z_1)$ and $\Im(Z_2)$ is similarly distributed. Thus, $Z$ is not jointly Gaussian, and the theorem
doesn't apply. Even though $Z_1$ and $Z_2$ are individually circularly-symmetric Gaussian, $Z$ is not
circularly-symmetric Gaussian. In this example, it turns out that $Z$ is circularly symmetric
and $M_Z = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$. The example could be changed slightly, changing the definition of $Z_2$ to
$\Re(Z_2) = U\Re(Z_1)$ and $\Im(Z_2) \sim \mathcal{N}(0, 1/2)$, where $\Im(Z_2)$ is statistically independent of all the
other variables. Then $M_Z$ is still 0, but $Z$ is not circularly symmetric. Thus, without the
jointly-Gaussian property, the pseudo-covariance matrix does not specify whether $Z$ is circularly
symmetric.


Example 7.8.3. Consider a vector $Z = (Z_1, Z_2)^T$ where $Z_1 \sim \mathcal{CN}(0,1)$ and $Z_2 = Z_1^*$. Since
$\Re(Z_2) = \Re(Z_1)$ and $\Im(Z_2) = -\Im(Z_1)$, we see that the four real and imaginary components
of $Z$ are jointly Gaussian, so $Z$ is complex jointly Gaussian and the theorem applies. We see
that $M_Z = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, and thus $Z$ is jointly Gaussian but not circularly symmetric. This makes
sense, since when $Z_1$ is real (or approximately real), $Z_2 = Z_1$ (or $Z_2 \approx Z_1$) and when $Z_1$ is
pure imaginary (or close to pure imaginary), $Z_2$ is the negative of $Z_1$ (or $Z_2 \approx -Z_1$). Thus the
relationship of $Z_2$ to $Z_1$ is certainly not phase invariant.

What makes this example interesting is that both $Z_1 \sim \mathcal{CN}(0,1)$ and $Z_2 \sim \mathcal{CN}(0,1)$. Thus, as in
Example 7.8.2, it is the relationship between $Z_1$ and $Z_2$ that breaks up the circularly-symmetric
Gaussian property. Here it is the failure of circular symmetry that causes the problem, whereas in Example
7.8.2 it was the lack of a jointly-Gaussian distribution.
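
The pseudo-covariance matrices quoted in Examples 7.8.2 and 7.8.3 can be checked approximately by simulation. The following is a minimal sketch using sample averages; the helper names, seed, and sample size are assumptions made only for illustration.

```python
import math
import random

def cn01(rng):
    """One sample of CN(0, 1): real and imaginary parts iid N(0, 1/2)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def covariance_and_pseudo(vectors):
    """Estimate K_Z = E[Z Z^dagger] and M_Z = E[Z Z^T] from sample vectors."""
    count = len(vectors)
    n = len(vectors[0])
    K = [[sum(v[a] * v[b].conjugate() for v in vectors) / count for b in range(n)]
         for a in range(n)]
    M = [[sum(v[a] * v[b] for v in vectors) / count for b in range(n)]
         for a in range(n)]
    return K, M

def draw_example_782(rng):
    """Z2 = U * Z1 with U = +1 or -1, independent of Z1."""
    z1 = cn01(rng)
    return (z1, rng.choice((-1, 1)) * z1)

def draw_example_783(rng):
    """Z2 is the complex conjugate of Z1."""
    z1 = cn01(rng)
    return (z1, z1.conjugate())

rng = random.Random(1)
K2, M2 = covariance_and_pseudo([draw_example_782(rng) for _ in range(20_000)])
K3, M3 = covariance_and_pseudo([draw_example_783(rng) for _ in range(20_000)])
print("Example 7.8.2 pseudo-covariance:", M2)
print("Example 7.8.3 pseudo-covariance:", M3)
```

Up to sampling error, the first estimate should be near the zero matrix and the second should have off-diagonal entries near 1. Note that the sample moments alone cannot reveal that the vector of Example 7.8.2 fails to be jointly Gaussian.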


In Section 7.3, we found that an excellent approach to real jointly-Gaussian rv's was to view
them as linear transformations of a rv with iid components, each $\mathcal{N}(0,1)$. We will find here that
the same approach applies to circularly-symmetric Gaussian vectors. Thus let $A$ be an arbitrary
complex $n$ by $m$ matrix and let the complex rv $Z = (Z_1, \ldots, Z_n)^T$ be defined by

$$ Z = AW, \tag{7.72} $$


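where $W \sim \mathcal{CN}(0, I_m)$. The complex rv defined in this way has jointly Gaussian real and
imaginary parts. To see this, represent (7.72) as the following real linear transformation, which
maps the $2m$ real components of $W$ into the $2n$ real components of $Z$:

$$ \begin{bmatrix} Z_{re} \\ Z_{im} \end{bmatrix} = \begin{bmatrix} A_{re} & -A_{im} \\ A_{im} & A_{re} \end{bmatrix} \begin{bmatrix} W_{re} \\ W_{im} \end{bmatrix}, \tag{7.73} $$

where $Z_{re} = \Re(Z)$, $Z_{im} = \Im(Z)$, $A_{re} = \Re(A)$, and $A_{im} = \Im(A)$.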
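
The equivalence between the complex product $Z = AW$ and the real block form (7.73) can be checked directly on a small example. The sketch below uses plain Python lists; the function names and the particular matrix are illustrative assumptions.

```python
def complex_matvec(A, w):
    """Compute z = A w with complex arithmetic."""
    return [sum(a * x for a, x in zip(row, w)) for row in A]

def real_block_matvec(A, w):
    """Compute the stacked vector [Re(z); Im(z)] using the real block matrix of (7.73)."""
    n, m = len(A), len(A[0])
    w_stacked = [x.real for x in w] + [x.imag for x in w]
    top = [[A[r][c].real for c in range(m)] + [-A[r][c].imag for c in range(m)]
           for r in range(n)]
    bottom = [[A[r][c].imag for c in range(m)] + [A[r][c].real for c in range(m)]
              for r in range(n)]
    return [sum(b * x for b, x in zip(row, w_stacked)) for row in top + bottom]

A = [[1 + 2j, 3 - 1j, 0.5j],
     [2 + 0j, -1j, 1 + 1j]]           # n = 2 rows, m = 3 columns
w = [1 - 1j, 2 + 0.5j, -3j]

z = complex_matvec(A, w)
stacked = real_block_matvec(A, w)      # length 2n: real parts, then imaginary parts
print(z, stacked)
```

The first $n$ entries of the stacked result are the real parts of $z$ and the last $n$ entries are the imaginary parts.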
The rv $Z$ is also circularly symmetric.^23 To see this, note that

$$ K_Z = E[AWW^\dagger A^\dagger] = AA^\dagger; \qquad M_Z = E[AWW^T A^T] = 0. \tag{7.74} $$

Thus from Theorem 7.8.1, $Z$ is circularly-symmetric Gaussian and $Z \sim \mathcal{CN}(0, AA^\dagger)$.
This proves the if part of the following theorem.


Theorem 7.8.2. A complex rv $Z$ is circularly-symmetric Gaussian if and only if it can be
expressed as $Z = AW$ for a complex matrix $A$ and an iid circularly-symmetric Gaussian rv
$W \sim \mathcal{CN}(0, I_n)$.

^23 Conversely, as we will see later, all circularly symmetric jointly-Gaussian rv's can be defined this way.

Proof: Let $Z \sim \mathcal{CN}(0, K_Z)$ be an arbitrary circularly-symmetric Gaussian rv. From Appendix 7A.1,
$K_Z$ can be expressed as

$$ K_Z = Q\Lambda Q^{-1}, \tag{7.75} $$

where $Q$ is unitary and its columns are orthonormal eigenvectors of $K_Z$. The matrix $\Lambda$ is diagonal
and its entries are the eigenvalues of $K_Z$, all of which are nonnegative. We can then express $Z$
as $Z = RW$ where $R = Q\sqrt{\Lambda}Q^{-1}$ and $W \sim \mathcal{CN}(0, I)$.


Next note that any linear functional, say $V = b^\dagger Z$, of a circularly-symmetric Gaussian rv
$Z$ can be expressed as $V = (b^\dagger A)W$ and is thus a circularly-symmetric random variable.
In particular, for each orthonormal eigenvector $q_k$ of $K_Z$, we see that $q_k^\dagger Z = \langle Z, q_k \rangle$ is a
circularly-symmetric rv. Furthermore, using (7.75), it is easy to show that these variables are
uncorrelated, and in particular,

$$ E[\langle Z, q_k \rangle \langle Z, q_j \rangle^*] = \lambda_k \delta_{k,j}. $$

Since these rv's are jointly Gaussian, this also means that they are statistically independent.
From the projection theorem, any sample value $z$ of the rv $Z$ can be represented as $z =
\sum_j \langle z, q_j \rangle q_j$, so we also have

$$ Z = \sum_j \langle Z, q_j \rangle q_j. \tag{7.76} $$


This represents $Z$ as an orthonormal expansion whose coefficients, $\langle Z, q_j \rangle$, are independent
circularly-symmetric Gaussian rv's. The probability density of $Z$ is then simply the probability
density of the sequence of coefficients.^24 Remembering that each circularly-symmetric Gaussian
rv $\langle Z, q_j \rangle$ corresponds to two independent real rv's with variance $\lambda_j/2$, the resulting density,
assuming that all eigenvalues are positive, is

$$ f_Z(z) = \prod_{j=1}^{n} \frac{1}{\pi\lambda_j} \exp\left(-|\langle z, q_j \rangle|^2 \lambda_j^{-1}\right). \tag{7.77} $$

This is the density of $n$ independent circularly-symmetric Gaussian random variables,
$(\langle Z, q_1 \rangle, \ldots, \langle Z, q_n \rangle)$, with variances $\lambda_1, \ldots, \lambda_n$ respectively. This is the same as the analogous
result for jointly-Gaussian real random vectors which says that there is always an orthonormal
basis in which the variables are Gaussian and independent. This analogy forms the simplest
way to (sort of) visualize circularly-symmetric Gaussian vectors: they have the same kind of
elliptical symmetry as the real case, except that here, each complex random variable is also
circularly symmetric.


It is often more convenient to express $f_Z$ for $Z \sim \mathcal{CN}(0, K_Z)$ directly in terms of $K_Z$. Recognizing
that $K_Z^{-1} = Q\Lambda^{-1}Q^{-1}$, (7.77) becomes

$$ f_Z(z) = \frac{1}{\pi^n \det(K_Z)} \exp(-z^\dagger K_Z^{-1} z). \tag{7.78} $$

It should be clear that (7.77) and (7.78) are each also if-and-only-if conditions for circularly-symmetric
jointly-Gaussian random vectors with a positive-definite covariance matrix.
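
To make (7.78) concrete, the sketch below evaluates the density for the special case $n = 2$, where the inverse and determinant of $K_Z$ have simple closed forms. The function name is hypothetical, and the code assumes the supplied matrix is Hermitian and positive definite without checking it.

```python
import math

def cn_density_2d(z, K):
    """Evaluate (7.78) for n = 2: exp(-z^dagger K^{-1} z) / (pi^2 det K).

    z is a pair of complex numbers; K is a 2 by 2 Hermitian positive-definite
    matrix given as nested sequences (assumed, not verified).
    """
    (a, b), (c, d) = K
    det = (a * d - b * c).real          # real and positive under the assumption on K
    z1, z2 = z
    # K^{-1} = [[d, -b], [-c, a]] / det for a 2 by 2 matrix
    quad = (z1.conjugate() * (d * z1 - b * z2)
            + z2.conjugate() * (-c * z1 + a * z2)) / det
    return math.exp(-quad.real) / (math.pi ** 2 * det)

z = (0.3 + 0.4j, -1.0 + 0.5j)
identity_density = cn_density_2d(z, [[1, 0], [0, 1]])       # reduces to (7.70)
diagonal_density = cn_density_2d(z, [[2.0, 0.0], [0.0, 0.5]])
print(identity_density, diagonal_density)
```

With the identity covariance the value agrees with (7.70), and with a diagonal covariance it factors into a product of two scalar densities, as (7.77) suggests.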


^24 This relies on the 'obvious' fact that incremental volume is the same in any orthonormal basis. The sceptical
reader, with some labor, can work out the probability density in $\mathbb{R}^{2n}$ and then transform to $\mathbb{C}^n$.


7.9 Signal to noise ratio


There are a number of different measures of signal power, noise power, energy per symbol, energy
per bit, and so forth, which are defined here. These measures are explained in terms of QAM
and PAM, but they also apply more generally. In the previous section, a fairly general set of
orthonormal functions was used, and here a specific set is assumed. Consider the orthonormal
functions $p_k(t) = p(t - kT)$ as used in QAM, and use a nominal passband bandwidth $W = 1/T$.
Each QAM symbol $U_k$ can be assumed to be iid with energy $E_s = E[|U_k|^2]$. This is the signal
energy per real component plus imaginary component. The noise energy per real plus imaginary
component is defined to be $N_0$. Thus the signal to noise ratio is defined to be

$$ \mathrm{SNR} = \frac{E_s}{N_0} \quad \text{for QAM.} \tag{7.79} $$


For baseband PAM, using real orthonormal functions satisfying $p_k(t) = p(t - kT)$, the signal
energy per symbol is $E_s = E[|U_k|^2]$. Since the symbol is one dimensional, i.e., real, the noise
energy in this single dimension is defined to be $N_0/2$. Thus SNR is defined to be

$$ \mathrm{SNR} = \frac{2E_s}{N_0} \quad \text{for PAM.} \tag{7.80} $$

For QAM there are $W$ complex degrees of freedom per second, so the signal power is given by
$P = E_s W$. For PAM at baseband, there are $2W$ degrees of freedom per second, so the signal
power is $P = 2E_s W$. Thus in each case, the SNR becomes

$$ \mathrm{SNR} = \frac{P}{N_0 W} \quad \text{for QAM and PAM.} \tag{7.81} $$
We can interpret the denominator here as the overall noise power in the bandwidth W, so SNR 
is also viewed as the signal power divided by the noise power in the nominal band. For those 
who like to minimize the number of formulas they remember, all of these equations for SNR 
follow from a basic definition as the signal energy per degree of freedom divided by the noise 
energy per degree of freedom. 


PAM and QAM each use the same signal energy for each degree of freedom (or at least for each
complex pair of degrees of freedom), whereas other systems might use the available degrees of
freedom differently. For example, PAM with baseband bandwidth $W$ occupies bandwidth $2W$ if
modulated to passband, and uses only half the available degrees of freedom. For these situations,
SNR can be defined in several different ways depending on the context. As another example,
frequency hopping is a technique used both in wireless and in secure communication. It is the
same as QAM, except that the carrier frequency $f_c$ changes pseudo-randomly at intervals long
relative to the symbol interval. Here the bandwidth $W$ might be taken as the bandwidth of the
underlying QAM system, or might be taken as the overall bandwidth within which $f_c$ hops. The
SNR in (7.81) is quite different in the two cases.


The appearance of W in the denominator of the expression for SNR in (7.81) is rather surprising 
and disturbing at first. It says that if more bandwidth is allocated to a communication system 
with the same available power, then SNR decreases. This is best interpreted by viewing SNR in 
terms of signal to noise energy per degree of freedom. As the number of degrees of freedom per 
second increases, the SNR decreases, but the available number of degrees of freedom increases. 
We will later see that the net gain is positive.

Another important parameter is the rate $R$; this is the number of transmitted bits per second,
which is the number of bits per symbol, $\log_2|\mathcal{A}|$, times the number of symbols per second. Thus

$$ R = W \log_2|\mathcal{A}| \ \text{for QAM}; \qquad R = 2W \log_2|\mathcal{A}| \ \text{for PAM.} \tag{7.82} $$

An important parameter is the spectral efficiency of the system, which is defined as $\rho = R/W$.
This is the transmitted number of bits/sec in each unit frequency interval. For QAM and PAM,
$\rho$ is given by (7.82) to be

$$ \rho = \log_2|\mathcal{A}| \ \text{for QAM}; \qquad \rho = 2\log_2|\mathcal{A}| \ \text{for PAM.} \tag{7.83} $$

More generally the spectral efficiency $\rho$ can be defined as the number of transmitted bits per
degree of freedom. From (7.83), achieving a large value of spectral efficiency requires making
the symbol alphabet large; note that $\rho$ increases only logarithmically with $|\mathcal{A}|$.


Yet another parameter is the energy per bit $E_b$. Since each symbol contains $\log_2|\mathcal{A}|$ bits, $E_b$ is
given for both QAM and PAM by

$$ E_b = \frac{E_s}{\log_2|\mathcal{A}|}. \tag{7.84} $$

One of the most fundamental quantities in communication is the ratio $E_b/N_0$. Both $E_b$ and
$N_0$ are measured in the same way, so the ratio is dimensionless, and it is the ratio that is
important rather than either alone. Finding ways to reduce $E_b/N_0$ is important, particularly
where transmitters use batteries. For QAM, we substitute (7.79) and (7.83) into (7.84), getting

$$ \frac{E_b}{N_0} = \frac{\mathrm{SNR}}{\rho}. \tag{7.85} $$

The same equation is seen to be valid for PAM. This says that achieving a small value for $E_b/N_0$
requires a small ratio of SNR to $\rho$. We look at this next in terms of channel capacity.
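
The definitions (7.79) through (7.85) are easy to mix up, so a small worked computation may help. The sketch below encodes them as functions; the function names, the scheme strings, and the numerical values are illustrative assumptions.

```python
import math

def snr(Es, N0, scheme):
    """SNR per (7.79) for QAM and (7.80) for PAM."""
    if scheme == "QAM":
        return Es / N0
    if scheme == "PAM":
        return 2.0 * Es / N0
    raise ValueError("scheme must be 'QAM' or 'PAM'")

def signal_power(Es, W, scheme):
    """Signal power: W complex symbols/sec for QAM, 2W real symbols/sec for PAM."""
    return Es * W if scheme == "QAM" else 2.0 * Es * W

def spectral_efficiency(alphabet_size, scheme):
    """Spectral efficiency rho per (7.83), in bits per second per Hz."""
    bits = math.log2(alphabet_size)
    return bits if scheme == "QAM" else 2.0 * bits

def eb_over_n0(Es, N0, alphabet_size):
    """Energy per bit over N0, from (7.84)."""
    return Es / (N0 * math.log2(alphabet_size))

Es, N0, W = 10.0, 1.0, 1.0e6
for scheme, size in (("QAM", 16), ("PAM", 4)):
    s = snr(Es, N0, scheme)
    rho = spectral_efficiency(size, scheme)
    # (7.81): SNR equals P / (N0 W); (7.85): Eb/N0 equals SNR / rho
    print(scheme, s, signal_power(Es, W, scheme) / (N0 * W), eb_over_n0(Es, N0, size), s / rho)
```

For both schemes the second printed value matches the first, as (7.81) states, and the last two values match, as (7.85) states.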


One of Shannon's most famous results was to develop the concept of the capacity $C$ of an
additive WGN communication channel. This is defined as the supremum of the number of bits
per second that can be transmitted and received with arbitrarily small error probability. For
the WGN channel with a constraint $W$ on the bandwidth and a constraint $P$ on the received
signal power, he showed that

$$ C = W \log_2\left(1 + \frac{P}{W N_0}\right). \tag{7.86} $$

He showed that any rate $R < C$ could be achieved with arbitrarily small error probability by
using channel coding of arbitrarily large constraint length. He also showed, and later results
strengthened, the fact that larger rates would lead to larger error probabilities. This result will
be demonstrated in the next chapter. It is widely used as a benchmark for comparison
with particular systems. Figure 7.5 shows a sketch of $C$ as a function of $W$. Note that $C$
increases monotonically with $W$, reaching a limit of $(P/N_0)\log_2 e$ as $W \to \infty$. This is known as
the ultimate Shannon limit on achievable rate. Note also that when $W = P/N_0$, i.e., when the
bandwidth is large enough for the SNR to reach 1, then $C$ is within $1/\log_2 e$, which is 69%, of
the ultimate Shannon limit.

Figure 7.5: Capacity as a function of bandwidth $W$ for fixed $P/N_0$. (The curve rises with $W$
toward the asymptote $(P/N_0)\log_2 e$.)
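
The behavior sketched in Figure 7.5 can be reproduced numerically from (7.86). The following is a minimal sketch; the function names and the values of $P$ and $N_0$ are illustrative assumptions.

```python
import math

def capacity(P, N0, W):
    """Capacity (7.86) in bits per second; log1p keeps precision when P/(N0 W) is tiny."""
    return W * math.log1p(P / (N0 * W)) / math.log(2.0)

def ultimate_shannon_limit(P, N0):
    """Limit of the capacity as W grows without bound: (P/N0) log2(e)."""
    return (P / N0) * math.log2(math.e)

P, N0 = 1.0, 1.0e-3
limit = ultimate_shannon_limit(P, N0)
for W in (1.0e2, 1.0e3, 1.0e4, 1.0e6):
    C = capacity(P, N0, W)
    print(f"W = {W:9.0f} Hz   C = {C:8.1f} bits/s   C/limit = {C / limit:.4f}")
```

With these values, $W = P/N_0 = 1000$ Hz gives SNR = 1, and the printed ratio for that row is $\ln 2 \approx 0.693$.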


For any achievable rate, $R$, i.e., any rate at which the error probability can be made arbitrarily
small by coding and other clever stratagems, the theorem above says that $R < C$. If we rewrite
(7.86), substituting SNR for $P/(WN_0)$ and substituting $\rho$ for $R/W$, we get

$$ \rho < \log_2(1 + \mathrm{SNR}). \tag{7.87} $$

If we substitute this into (7.85), we get

$$ \frac{E_b}{N_0} > \frac{\mathrm{SNR}}{\log_2(1 + \mathrm{SNR})}. $$

The right side, denoted $(E_b/N_0)_{\min}$, is a monotonic increasing function of the single variable SNR,
which in turn is decreasing in $W$. Thus $(E_b/N_0)_{\min}$ is monotonic decreasing in $W$. As $W \to \infty$ it
reaches the limit $\ln 2 = 0.693$, i.e., $-1.59$ dB. As $W$ decreases, it grows, reaching 0 dB at SNR = 1,
and increasing without bound for yet smaller $W$. The limiting spectral efficiency, however, is $C/W$.
This is also monotonic decreasing in $W$, going to 0 as $W \to \infty$. In other words, there is a trade-off
between $E_b/N_0$ (which we would like to be small) and spectral efficiency (which we would like to be
large). This is further discussed in the next chapter.
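
The lower bound on $E_b/N_0$ as a function of SNR can be tabulated in a few lines. This is a sketch only; the function names are assumptions introduced for the illustration.

```python
import math

def min_eb_over_n0(snr):
    """Lower bound SNR / log2(1 + SNR) on Eb/N0, written with log1p for small SNR."""
    return snr * math.log(2.0) / math.log1p(snr)

def to_db(x):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(x)

for snr in (1.0e-6, 1.0e-2, 1.0, 10.0, 100.0):
    bound = min_eb_over_n0(snr)
    rho_max = math.log2(1.0 + snr)   # limiting spectral efficiency C/W at this SNR
    print(f"SNR = {snr:8.2e}   bound = {to_db(bound):6.2f} dB   C/W = {rho_max:.4f}")
```

The table shows the trade-off described above: small SNR (large $W$) drives the bound toward $-1.59$ dB while the spectral efficiency $C/W$ goes to 0.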


7.10 Summary of Random Processes 


The additive noise in physical communication systems is usually best modeled as a random
process, i.e., a collection of random variables, one at each real-valued instant of time. A random
process can be specified by its joint probability distribution over all finite sets of epochs, but
additive noise is most often modeled by the assumption that the random variables are all zero-mean
Gaussian and their joint distribution is jointly Gaussian.

These assumptions were motivated partly by the central limit theorem, partly by the simplicity
of working with Gaussian processes, partly by custom, and partly by various extremal properties.
We found that jointly Gaussian means a great deal more than individually Gaussian, and that
the resulting joint densities are determined by the covariance matrix. These densities have
ellipsoidal equiprobability contours whose axes are the eigenvectors of the covariance matrix.


A sample function, say $Z(t,\omega)$, of a random process $Z(t)$ can be viewed as a waveform and
interpreted as an $\mathcal{L}_2$ vector. For any fixed $\mathcal{L}_2$ function $g(t)$, the inner product $\langle g(t), Z(t,\omega) \rangle$
maps $\omega$ into a real number and thus can be viewed over $\Omega$ as a random variable. This rv is called
a linear functional of $Z(t)$ and is denoted by $\int g(t)Z(t)\,dt$.

These linear functionals arise when expanding a random process into an orthonormal expansion
and also at each epoch when a random process is passed through a linear filter. For simplicity
these linear functionals and the underlying random processes are not viewed in a measure
theoretic form, although the $\mathcal{L}_2$ development in Chapter 4 provides some insight about the
mathematical subtleties involved.


Noise processes are usually viewed as being stationary, which effectively means that their statistics
do not change in time. This generates two problems: first, that the sample functions have
infinite energy, and second, that there is no clear way to see whether results are highly sensitive
to time-regions far outside the region of interest. Both of these problems are treated by defining
effective stationarity (or effective wide-sense stationarity) in terms of the behavior of the process
over a finite interval. This analysis shows, for example, that Gaussian linear functionals depend
only on effective stationarity over the region of interest. From a practical standpoint, this means
that the simple results arising from the assumption of stationarity can be used without concern
for the process statistics outside the time-range of interest.

The spectral density of a stationary process can also be used without concern for the process
outside the time-range of interest. If a process is effectively WSS, it has a single variable
covariance function corresponding to the interval of interest, and this has a Fourier transform
which operates as the spectral density over the region of interest. How these results change as
the region of interest approaches $\infty$ is explained in Appendix 7A.3.


7A Appendix: Supplementary topics 


7A.1 Properties of covariance matrices 


This appendix summarizes some properties of covariance matrices that are often useful but not
absolutely critical to our treatment of random processes. Rather than repeat everything twice,
we combine the treatment for real and complex rv's together. On a first reading, however, one
might assume everything to be real. Most of the results are the same in each case, although
the complex-conjugate signs can be removed in the real case. It is important to realize that the
properties developed here apply to nonGaussian as well as Gaussian rv's. All random variables and
random vectors here are assumed to be zero-mean.


A square matrix $K$ is a covariance matrix if a (real or complex) rv $Z$ exists such that $K =
E[ZZ^{T*}]$. The complex conjugate of the transpose, $Z^{T*}$, is called the Hermitian transpose and
denoted by $Z^\dagger$. If $Z$ is real, of course, $Z^\dagger = Z^T$. Similarly, for a matrix $K$, the Hermitian
conjugate, denoted $K^\dagger$, is $K^{T*}$. A matrix is Hermitian if $K = K^\dagger$. Thus a real Hermitian matrix
(a Hermitian matrix containing all real terms) is a symmetric matrix.

An $n$ by $n$ square matrix $K$ with real or complex terms is nonnegative definite if it is Hermitian
and if, for all $b \in \mathbb{C}^n$, $b^\dagger K b$ is real and nonnegative. It is positive definite if, in addition,
$b^\dagger K b > 0$ for $b \ne 0$. We now list some of the important relationships between nonnegative
definite, positive definite, and covariance matrices and state some other useful properties of
covariance matrices.


1. Every covariance matrix $K$ is nonnegative definite. To see this, let $Z$ be a rv such that
$K = E[ZZ^\dagger]$. $K$ is Hermitian since $K(k,m) = E[Z_k Z_m^*] = (E[Z_m Z_k^*])^* = K(m,k)^*$ for all $k, m$.
For any $b \in \mathbb{C}^n$, let $X = b^\dagger Z$. Then $0 \le E[|X|^2] = E[(b^\dagger Z)(b^\dagger Z)^*] = E[b^\dagger ZZ^\dagger b] = b^\dagger K b$.

2. For any complex $n$ by $n$ matrix $A$, the matrix $K = AA^\dagger$ is a covariance matrix. In fact, let
$Z$ have $n$ independent unit-variance elements so that $K_Z$ is the identity matrix $I_n$. Then
$Y = AZ$ has the covariance matrix $K_Y = E[(AZ)(AZ)^\dagger] = E[AZZ^\dagger A^\dagger] = AA^\dagger$. Note that
if $A$ is real and $Z$ is real, then $Y$ is real and, of course, $K_Y$ is real. It is also possible for
$A$ to be real and $Z$ complex, and in this case $K_Y$ is still real but $Y$ is complex.

3. A covariance matrix $K$ is positive definite if and only if $K$ is nonsingular. To see this, let
$K = E[ZZ^\dagger]$ and note that if $b^\dagger K b = 0$ for some $b \ne 0$, then $X = b^\dagger Z$ has zero variance,
and therefore is zero with probability 1. Thus $E[XZ^\dagger] = 0$, so $b^\dagger E[ZZ^\dagger] = 0$. Since $b \ne 0$
and $b^\dagger K = 0$, $K$ must be singular. Conversely, if $K$ is singular, there is some $b \ne 0$ such that
$Kb = 0$, so $b^\dagger K b$ is also 0.


4. A complex number $\lambda$ is an eigenvalue of a square matrix $K$ if $Kq = \lambda q$ for some nonzero
vector $q$; the corresponding $q$ is an eigenvector of $K$. The following results about the
eigenvalues and eigenvectors of positive (nonnegative) definite matrices $K$ are standard
linear algebra results (see for example, Strang, section 5.5):
All eigenvalues of $K$ are positive (nonnegative). If $K$ is real, the eigenvectors can be taken to
be real. Eigenvectors of different eigenvalues are orthogonal, and the eigenvectors of any one
eigenvalue form a subspace whose dimension is called the multiplicity of that eigenvalue. If
$K$ is $n$ by $n$, then $n$ orthonormal eigenvectors, $q_1, \ldots, q_n$, can be chosen. The corresponding
list of eigenvalues, $\lambda_1, \ldots, \lambda_n$, need not be distinct; specifically, the number of repetitions
of each eigenvalue equals the multiplicity of that eigenvalue. Finally $\det(K) = \prod_{k=1}^{n} \lambda_k$.

5. If $K$ is nonnegative definite, let $Q$ be the matrix with the orthonormal columns, $q_1, \ldots, q_n$,
defined above. Then $Q$ satisfies $KQ = Q\Lambda$ where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. This is simply the
vector version of the eigenvector/eigenvalue relationship above. Since $q_k^\dagger q_m = \delta_{km}$, $Q$ also
satisfies $Q^\dagger Q = I_n$. We then also have $Q^{-1} = Q^\dagger$ and thus $QQ^\dagger = I_n$; this says that the
rows of $Q$ are also orthonormal. Finally, by post-multiplying $KQ = Q\Lambda$ by $Q^\dagger$, we see that
$K = Q\Lambda Q^\dagger$. The matrix $Q$ is called unitary if complex, and orthogonal if real.


6. If $K$ is positive definite, then $Kb \ne 0$ for $b \ne 0$. Thus $K$ can have no zero eigenvalues and
$\Lambda$ is nonsingular. It follows that $K$ can be inverted as $K^{-1} = Q\Lambda^{-1}Q^\dagger$. For any $n$-vector $b$,

$$ b^\dagger K^{-1} b = \sum_k \lambda_k^{-1} |\langle b, q_k \rangle|^2. $$

To see this, note that $b^\dagger K^{-1} b = b^\dagger Q\Lambda^{-1}Q^\dagger b$. Letting $v = Q^\dagger b$ and using the fact that the
rows of $Q^\dagger$ are the Hermitian transposes $q_k^\dagger$ of the orthonormal vectors $q_k$, we see that $\langle b, q_k \rangle$ is
the $k$th component of $v$. We then have $v^\dagger \Lambda^{-1} v = \sum_k \lambda_k^{-1} |v_k|^2$, which is equivalent to the desired
result. Note that $\langle b, q_k \rangle$ is the projection of $b$ in the direction of $q_k$.

7. $\det K = \prod_{k=1}^{n} \lambda_k$ where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $K$ repeated according to their
multiplicity. Thus if $K$ is positive definite, $\det K > 0$ and if $K$ is nonnegative definite,
$\det K \ge 0$.

8. If $K$ is a positive definite (semi-definite) matrix, then there is a unique positive definite
(semi-definite) square root matrix $R$ satisfying $R^2 = K$. In particular, $R$ is given by

$$ R = Q\Lambda^{1/2}Q^\dagger \quad \text{where} \quad \Lambda^{1/2} = \mathrm{diag}\left(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \ldots, \sqrt{\lambda_n}\right). \tag{7.88} $$


9. If $K$ is nonnegative definite, then $K$ is a covariance matrix. In particular, $K$ is the covariance
matrix of $Y = RV$ where $R$ is the square root matrix in (7.88) and $K_V = I_n$.
This shows that zero-mean jointly-Gaussian rv's exist with any desired covariance matrix;
the definition of jointly Gaussian here as a linear combination of normal rv's does not limit
the possible set of covariance matrices.

For any given covariance matrix $K$, there are usually many choices for $A$ satisfying $K = AA^\dagger$.
The square root matrix $R$ above is simply a convenient choice. Some of the results in this section
are summarized in the following theorem:

Theorem 7A.1. An $n$ by $n$ matrix $K$ is a covariance matrix if and only if it is nonnegative
definite. Also it is a covariance matrix if and only if $K = AA^\dagger$ for an $n$ by $n$ matrix $A$. One
choice for $A$ is the square root matrix $R$ in (7.88).
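
One direction of Theorem 7A.1 can be spot-checked numerically: for any complex matrix $A$, the product $AA^\dagger$ should be Hermitian and should give a real nonnegative quadratic form. The sketch below does this with plain Python lists; the helper names, the seed, and the matrix size are illustrative assumptions.

```python
import random

def hermitian_transpose(A):
    """Return the conjugate transpose of a matrix stored as nested lists."""
    rows, cols = len(A), len(A[0])
    return [[A[r][c].conjugate() for r in range(rows)] for c in range(cols)]

def matmul(A, B):
    """Matrix product of nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def quadratic_form(b, K):
    """Compute b^dagger K b for a vector b and square matrix K."""
    n = len(b)
    return sum(b[i].conjugate() * K[i][j] * b[j] for i in range(n) for j in range(n))

def random_complex(rng):
    return complex(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))

rng = random.Random(2)
A = [[random_complex(rng) for _ in range(3)] for _ in range(3)]
K = matmul(A, hermitian_transpose(A))

# b^dagger K b equals the squared norm of A^dagger b, so it is real and nonnegative.
forms = [quadratic_form([random_complex(rng) for _ in range(3)], K) for _ in range(100)]
print(min(f.real for f in forms), max(abs(f.imag) for f in forms))
```

A finite number of random test vectors does not prove nonnegative definiteness, but it illustrates the argument used in property 1.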


7A.2 The Fourier series expansion of a truncated random process 


Consider a (real zero-mean) random process that is effectively WSS over some interval $[-T_0/2, T_0/2]$
where $T_0$ is viewed intuitively as being very large. Let $\{Z(t);\ |t| \le T_0/2\}$ be this process
truncated to the interval $[-T_0/2, T_0/2]$. The objective of this and the next appendix is to view this
truncated process in the frequency domain and discover its relation to the spectral density of
an untruncated WSS process. A second objective is to interpret the statistical independence
between different frequencies for stationary Gaussian processes in terms of a truncated process.


Initially assume that $\{Z(t);\ |t| \le T_0/2\}$ is arbitrary; the effective WSS assumption will be added
later. Assume the sample functions of the truncated process are $\mathcal{L}_2$ real functions with
probability 1. Each $\mathcal{L}_2$ sample function, say $\{Z(t,\omega);\ |t| \le T_0/2\}$, can then be expanded in a Fourier
series,

$$ Z(t,\omega) = \sum_{k=-\infty}^{\infty} \hat{Z}_k(\omega) e^{2\pi i k t/T_0}, \qquad -\frac{T_0}{2} \le t \le \frac{T_0}{2}. \tag{7.89} $$

The orthogonal functions here are complex and the coefficients $\hat{Z}_k(\omega)$ can be similarly complex.
Since the sample functions $\{Z(t,\omega);\ |t| \le T_0/2\}$ are real, $\hat{Z}_k(\omega) = \hat{Z}_{-k}^*(\omega)$ for each $k$. This also
implies that $\hat{Z}_0(\omega)$ is real. The inverse Fourier series is given by

$$ \hat{Z}_k(\omega) = \frac{1}{T_0} \int_{-T_0/2}^{T_0/2} Z(t,\omega) e^{-2\pi i k t/T_0}\, dt. \tag{7.90} $$

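For each sample point $\omega$, $\hat{Z}_k(\omega)$ is a complex number, so $\hat{Z}_k$ is a complex random variable, i.e.,
$\Re(\hat{Z}_k)$ and $\Im(\hat{Z}_k)$ are each rv's. Also, $\Re(\hat{Z}_k) = \Re(\hat{Z}_{-k})$ and $\Im(\hat{Z}_k) = -\Im(\hat{Z}_{-k})$ for each $k$. It
follows that the truncated process $\{Z(t);\ |t| \le T_0/2\}$ defined by

$$ Z(t) = \sum_{k=-\infty}^{\infty} \hat{Z}_k e^{2\pi i k t/T_0}, \qquad -\frac{T_0}{2} \le t \le \frac{T_0}{2}, \tag{7.91} $$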

is a (real) random process and the complex random variables $\hat{Z}_k$ are complex linear functionals
of $Z(t)$ given by

$$ \hat{Z}_k = \frac{1}{T_0} \int_{-T_0/2}^{T_0/2} Z(t) e^{-2\pi i k t/T_0}\, dt. \tag{7.92} $$

Thus (7.91) and (7.92) are a Fourier series pair between a random process and a sequence of
complex rv's. The sample functions satisfy

$$ \frac{1}{T_0} \int_{-T_0/2}^{T_0/2} Z^2(t,\omega)\, dt = \sum_{k \in \mathbb{Z}} |\hat{Z}_k(\omega)|^2, $$

so that

$$ \frac{1}{T_0} E\left[\int_{-T_0/2}^{T_0/2} Z^2(t)\, dt\right] = \sum_{k \in \mathbb{Z}} E\left[|\hat{Z}_k|^2\right]. \tag{7.93} $$


The assumption that the sample functions are $\mathcal{L}_2$ with probability 1 can be seen to be equivalent
to the assumption that

$$ \sum_{k \in \mathbb{Z}} S_k < \infty \quad \text{where} \quad S_k = E[|\hat{Z}_k|^2]. \tag{7.94} $$


This is summarized in the following theorem. 


Theorem 7A.2. If a zero-mean (real) random process is truncated to $[-T_0/2, T_0/2]$ and the
truncated sample functions are $\mathcal{L}_2$ with probability 1, then the truncated process is specified by the
joint distribution of the complex Fourier-coefficient random variables $\{\hat{Z}_k\}$. Furthermore, any
joint distribution of $\{\hat{Z}_k;\ k \in \mathbb{Z}\}$ that satisfies (7.94) specifies such a truncated process.
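
The Fourier series pair and the energy relation for a single sample function can be checked on a small trigonometric example. The sketch below builds a real waveform from a handful of coefficients with conjugate symmetry and recovers them by a Riemann sum; the function names, the coefficient values, and the grid size are illustrative assumptions.

```python
import cmath
import math

def synthesize(coeffs, t, T0):
    """Evaluate the series sum_k c_k exp(2 pi i k t / T0); coeffs maps k to c_k."""
    return sum(c * cmath.exp(2j * math.pi * k * t / T0) for k, c in coeffs.items())

def analyze(sample_fn, k, T0, num_points=256):
    """Approximate (1/T0) * integral of sample_fn(t) exp(-2 pi i k t / T0) by a Riemann sum."""
    dt = T0 / num_points
    total = 0j
    for n in range(num_points):
        t = -T0 / 2.0 + n * dt
        total += sample_fn(t) * cmath.exp(-2j * math.pi * k * t / T0)
    return total * dt / T0

def mean_square(sample_fn, T0, num_points=256):
    """Approximate (1/T0) * integral of |sample_fn(t)|^2 over one interval of length T0."""
    dt = T0 / num_points
    return sum(abs(sample_fn(-T0 / 2.0 + n * dt)) ** 2 for n in range(num_points)) * dt / T0

T0 = 2.0
# Conjugate symmetry c_{-k} = c_k^* makes the synthesized waveform real.
coeffs = {0: 0.5 + 0j, 1: 1 + 2j, -1: 1 - 2j, 3: -0.25j, -3: 0.25j}
z = lambda t: synthesize(coeffs, t, T0)

recovered = analyze(z, 1, T0)
energy_time = mean_square(z, T0)
energy_freq = sum(abs(c) ** 2 for c in coeffs.values())
print(z(0.3), recovered, energy_time, energy_freq)
```

For a finite trigonometric sum the uniform Riemann sum is exact up to rounding, so the recovered coefficient and the two energy values agree closely.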


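The covariance function of a truncated process can be calculated from (7.91) as follows:

$$ K_Z(t,\tau) = E[Z(t)Z^*(\tau)] = E\left[\sum_k \hat{Z}_k e^{2\pi i k t/T_0} \sum_m \hat{Z}_m^* e^{-2\pi i m \tau/T_0}\right] $$

$$ = \sum_{k,m} E[\hat{Z}_k \hat{Z}_m^*]\, e^{2\pi i k t/T_0} e^{-2\pi i m \tau/T_0}, \qquad \text{for } -\frac{T_0}{2} \le t, \tau \le \frac{T_0}{2}. \tag{7.95} $$

Note that if the function on the right of (7.95) is extended over all $t, \tau \in \mathbb{R}$, it becomes periodic
in $t$ with period $T_0$ for each $\tau$, and periodic in $\tau$ with period $T_0$ for each $t$.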


Theorem 7A.2 suggests that virtually any truncated process can be represented as a Fourier 
series. Such a representation becomes far more insightful and useful, however, if the Fourier 
coefficients are uncorrelated. The next two subsections look at this case and then specialize to 
Gaussian processes, where uncorrelated implies independent. 


7A.3. Uncorrelated coefficients in a Fourier series 


Consider the covariance function in (7.95) under the additional assumption that the Fourier
coefficients $\{\hat{Z}_k;\ k \in \mathbb{Z}\}$ are uncorrelated, i.e., that $E[\hat{Z}_k \hat{Z}_m^*] = 0$ for all $k, m$ such that $k \ne m$.
This assumption also holds for $m = -k$, and, since $\hat{Z}_k = \hat{Z}_{-k}^*$ for all $k$, implies both that
$E[(\Re(\hat{Z}_k))^2] = E[(\Im(\hat{Z}_k))^2]$ and $E[\Re(\hat{Z}_k)\Im(\hat{Z}_k)] = 0$ (see Exercise 7.10). Since $E[\hat{Z}_k \hat{Z}_m^*] = 0$ for
$k \ne m$, (7.95) simplifies to

$$ K_Z(t,\tau) = \sum_{k \in \mathbb{Z}} S_k e^{2\pi i k (t-\tau)/T_0}, \qquad \text{for } -\frac{T_0}{2} \le t, \tau \le \frac{T_0}{2}. \tag{7.96} $$

This says that $K_Z(t,\tau)$ is a function only of $t - \tau$ over $-T_0/2 \le t, \tau \le T_0/2$, i.e., that $K_Z(t,\tau)$ is
effectively WSS over $[-T_0/2, T_0/2]$. Thus $K_Z(t,\tau)$ can be denoted as $K_Z(t-\tau)$ in this region, and

$$ K_Z(\tau) = \sum_k S_k e^{2\pi i k \tau/T_0}. \tag{7.97} $$

This means that the variances $S_k$ of the sinusoids making up this process are the Fourier series
coefficients of the covariance function $K_Z(\tau)$.


In summary, the assumption that a truncated (real) random process has uncorrelated Fourier
series coefficients over $[-T_0/2, T_0/2]$ implies that the process is WSS over $[-T_0/2, T_0/2]$ and that the
variances of those coefficients are the Fourier coefficients of the single-variable covariance. This is
intuitively plausible since the sine and cosine components of each of the corresponding sinusoids
are uncorrelated and have equal variance.

Note that $K_Z(t,\tau)$ in the above example is defined for all $t,\tau \in [-T_0/2, T_0/2]$ and thus $t-\tau$ ranges
from $-T_0$ to $T_0$ and $\tilde K_Z(\tau)$ must satisfy (7.97) for $-T_0 \le \tau \le T_0$. From (7.97), $\tilde K_Z(\tau)$ is also
periodic with period $T_0$, so the interval $[-T_0, T_0]$ constitutes 2 periods of $\tilde K_Z(\tau)$. This means,
for example, that $\mathsf{E}[Z(-\epsilon)Z^*(\epsilon)] = \mathsf{E}[Z(\tfrac{T_0}{2}-\epsilon)Z^*(-\tfrac{T_0}{2}+\epsilon)]$. More generally, the periodicity of
$\tilde K_Z(\tau)$ is reflected in $K_Z(t,\tau)$ as illustrated in Figure 7.6.


[Figure 7.6 omitted: the square $-T_0/2 \le t,\tau \le T_0/2$ with diagonal lines of equal $K_Z(t,\tau)$.]
Figure 7.6: Constraint on $K_Z(t,\tau)$ imposed by periodicity of $\tilde K_Z(t-\tau)$.
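
The relationship between (7.95), (7.96) and (7.97) can be checked numerically. The following is a minimal sketch, not part of the original development: the variances `S`, the period `T0`, and the function names are hypothetical choices made for illustration. It evaluates the double sum of (7.95) with uncorrelated coefficients and compares it with the single sum of (7.97), including the periodicity in the lag.

```python
import cmath

T0 = 2.0  # the truncation interval is [-T0/2, T0/2] (illustrative value)
# Hypothetical variances S_k = E[|Z_k|^2]; symmetric in k because Z_k = conj(Z_{-k}).
S = {k: 1.0 / (1 + k * k) for k in range(-5, 6)}

def cov_two_variable(t, tau):
    """Double sum of (7.95) with E[Z_k Z_m^*] = S_k for k == m and 0 otherwise."""
    total = 0j
    for k in S:
        for m in S:
            e_zk_zm = S[k] if k == m else 0.0
            total += (e_zk_zm
                      * cmath.exp(2j * cmath.pi * k * t / T0)
                      * cmath.exp(-2j * cmath.pi * m * tau / T0))
    return total

def cov_one_variable(lag):
    """Single sum of (7.97) evaluated at lag = t - tau."""
    return sum(S[k] * cmath.exp(2j * cmath.pi * k * lag / T0) for k in S)

same_lag_a = cov_two_variable(0.3, -0.4)  # t - tau = 0.7
same_lag_b = cov_two_variable(0.9, 0.2)   # t - tau = 0.7
from_7_97 = cov_one_variable(0.7)
shifted = cov_one_variable(0.7 - T0)      # same lag, one period away
print(same_lag_a, same_lag_b, from_7_97, shifted)
```

All four printed values agree up to rounding, and their imaginary parts vanish because the assumed $S_k$ are symmetric in $k$.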


We have seen that essentially any random process, when truncated to $[-T_0/2, T_0/2]$, has a Fourier
series representation, and that if the Fourier series coefficients are uncorrelated, then the truncated
process is WSS over $[-T_0/2, T_0/2]$ and has a covariance function which is periodic with period
$T_0$. This proves the first half of the following theorem:


Theorem 7A.3. Let $\{Z(t); t \in [-T_0/2, T_0/2]\}$ be a finite-energy zero-mean (real) random process
over $[-T_0/2, T_0/2]$ and let $\{Z_k; k \in \mathbb{Z}\}$ be the Fourier series rv’s of (7.91) and (7.92).

• If $\mathsf{E}[Z_k Z_m^*] = S_k \delta_{k,m}$ for all $k,m \in \mathbb{Z}$, then $\{Z(t); t \in [-T_0/2, T_0/2]\}$ is effectively WSS within
$[-T_0/2, T_0/2]$ and satisfies (7.97).

• If $\{Z(t); t \in [-T_0/2, T_0/2]\}$ is effectively WSS within $[-T_0/2, T_0/2]$ and if $\tilde K_Z(t-\tau)$ is periodic with
period $T_0$ over $[-T_0, T_0]$, then $\mathsf{E}[Z_k Z_m^*] = S_k \delta_{k,m}$ for some choice of $S_k \ge 0$ and for all
$k,m \in \mathbb{Z}$.


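Proof: To prove the second part of the theorem, note from (7.92) that

$$\mathsf{E}[Z_k Z_m^*] = \frac{1}{T_0^2} \int_{-T_0/2}^{T_0/2} \int_{-T_0/2}^{T_0/2} K_Z(t,\tau)\, e^{-2\pi i k t/T_0}\, e^{2\pi i m \tau/T_0}\, dt\, d\tau. \qquad (7.98)$$

By assumption, $K_Z(t,\tau) = \tilde K_Z(t-\tau)$ for $t,\tau \in [-T_0/2, T_0/2]$ and $\tilde K_Z(t-\tau)$ is periodic with period
$T_0$. Substituting $s = t-\tau$ for $t$ as a variable of integration, (7.98) becomes

$$\mathsf{E}[Z_k Z_m^*] = \frac{1}{T_0^2} \int_{-T_0/2}^{T_0/2} \left( \int_{-T_0/2-\tau}^{T_0/2-\tau} \tilde K_Z(s)\, e^{-2\pi i k s/T_0}\, ds \right) e^{-2\pi i k \tau/T_0}\, e^{2\pi i m \tau/T_0}\, d\tau. \qquad (7.99)$$

The integration over $s$ does not depend on $\tau$ because the interval of integration is one period
and both $\tilde K_Z(s)$ and $e^{-2\pi i k s/T_0}$ are periodic with period $T_0$. Thus this integral is only a function
of $k$, which we denote by $T_0 S_k$. Thus

$$\mathsf{E}[Z_k Z_m^*] = \frac{1}{T_0} \int_{-T_0/2}^{T_0/2} S_k\, e^{-2\pi i (k-m)\tau/T_0}\, d\tau = \begin{cases} S_k & \text{for } m = k \\ 0 & \text{otherwise.} \end{cases} \qquad (7.100)$$

This shows that the $Z_k$ are uncorrelated, completing the proof.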


The next issue is to find the relationship between these processes and processes that are WSS 
over all time. This can be done most cleanly for the case of Gaussian processes. Consider a WSS 
(and therefore stationary) zero-mean Gaussian random process²⁵ $\{Z'(t); t \in \mathbb{R}\}$ with covariance
function $\tilde K_{Z'}(\tau)$ and assume a limited region of nonzero covariance, i.e.,

$$\tilde K_{Z'}(\tau) = 0 \quad \text{for } |\tau| > \frac{T_1}{2}.$$


Let $S_{Z'}(f) \ge 0$ be the spectral density of $Z'$ and let $T_0$ satisfy $T_0 > T_1$. The Fourier series
coefficients of $\tilde K_{Z'}(\tau)$ over the interval $[-T_0/2, T_0/2]$ are then given by $S_k = S_{Z'}(k/T_0)/T_0$. Suppose this process
is approximated over the interval $[-T_0/2, T_0/2]$ by a truncated Gaussian process $\{Z(t); t \in [-T_0/2, T_0/2]\}$
composed of independent Fourier coefficients $Z_k$, i.e.

$$Z(t) = \sum_k Z_k\, e^{2\pi i k t/T_0}, \qquad -\frac{T_0}{2} \le t \le \frac{T_0}{2},$$

where

$$\mathsf{E}[Z_k Z_m^*] = S_k \delta_{k,m} \quad \text{for all } k,m \in \mathbb{Z}.$$


By Theorem 7A.3, the covariance function of $Z(t)$ is $\tilde K_Z(\tau) = \sum_k S_k e^{2\pi i k \tau/T_0}$. This is periodic
with period $T_0$ and for $|\tau| \le T_0/2$, $\tilde K_Z(\tau) = \tilde K_{Z'}(\tau)$. The original process $Z'(t)$ and the
approximation $Z(t)$ thus have the same covariance for $|\tau| \le T_0/2$. For $|\tau| > T_0/2$, $\tilde K_{Z'}(\tau) = 0$ whereas
$\tilde K_Z(\tau)$ is periodic over all $\tau$. Also, of course, $Z'$ is stationary, whereas $Z$ is effectively stationary
within its domain $[-T_0/2, T_0/2]$. The difference between $Z$ and $Z'$ becomes more clear in terms of
the two-variable covariance function, illustrated in Figure 7.7.

It is evident from the figure that if $Z'$ is modeled as a Fourier series over $[-T_0/2, T_0/2]$ using
independent complex circularly symmetric Gaussian coefficients, then $K_{Z'}(t,\tau) = K_Z(t,\tau)$ for
$|t|,|\tau| \le \frac{T_0-T_1}{2}$. Since zero-mean Gaussian processes are completely specified by their covariance
functions, this means that $Z'$ and $Z$ are statistically identical over this interval.


In summary, a stationary Gaussian process $Z'$ can not be perfectly modeled over an interval
$[-T_0/2, T_0/2]$ by using a Fourier series over that interval. The anomalous behavior is avoided,
however, by using a Fourier series over a larger interval, large enough to include the interval of
interest plus the interval over which $\tilde K_{Z'}(\tau) \ne 0$. If this latter interval is unbounded, then the
Fourier series model can only be used as an approximation. The following theorem has been
established.

[Figure 7.7 omitted: panels (a) and (b), each showing the square $-T_0/2 \le t,\tau \le T_0/2$.]
Figure 7.7: Part (a) illustrates $K_{Z'}(t,\tau)$ over the region $-T_0/2 \le t,\tau \le T_0/2$ for a stationary
process $Z'$ satisfying $\tilde K_{Z'}(\tau) = 0$ for $|\tau| > T_1/2$. Part (b) illustrates the approximating process
$Z$ comprised of independent sinusoids, spaced by $1/T_0$ and with uniformly distributed phase.
Note that the covariance functions are identical except for the anomalous behavior at the
corners where $t$ is close to $T_0/2$ and $\tau$ is close to $-T_0/2$ or vice versa.

²⁵ Equivalently, one can assume that $Z'$ is effectively WSS over some interval much larger than the intervals of
interest here.


Theorem 7A.4. Let $Z'(t)$ be a zero-mean stationary Gaussian random process with spectral
density $S_{Z'}(f)$ and covariance $\tilde K_{Z'}(\tau) = 0$ for $|\tau| > T_1/2$. Then for $T_0 > T_1$, the truncated process
$Z(t) = \sum_k Z_k e^{2\pi i k t/T_0}$ for $|t| \le T_0/2$, where the $Z_k$ are independent and $Z_k \sim \mathcal{CN}(0, S_{Z'}(k/T_0)/T_0)$ for all
$k \in \mathbb{Z}$, is statistically identical to $Z'(t)$ over $[-\frac{T_0-T_1}{2}, \frac{T_0-T_1}{2}]$.


The above theorem is primarily of conceptual use, rather than a problem-solving tool. It shows
that, aside from the anomalous behavior discussed above, stationarity can be used over the region
of interest without concern for how the process behaves far outside the interval of interest.
Also, since $T_0$ can be arbitrarily large, and thus the sinusoids arbitrarily closely spaced, we see
that the relationship between stationarity of a Gaussian process and independence of frequency
bands is quite robust and more than something valid only in a limiting sense.


7A.4. The Karhunen-Loeve expansion 


There is another approach, called the Karhunen-Loeve expansion, for representing a random
process that is truncated to some interval $[-T_0/2, T_0/2]$ by an orthonormal expansion. The
objective is to choose a set of orthonormal functions such that the coefficients in the expansion are
uncorrelated.


We start with the covariance function $K_Z(t,\tau)$ defined for $t,\tau \in [-T_0/2, T_0/2]$. The basic facts about
these time-limited covariance functions are virtually the same as the facts about covariance
matrices in Appendix 7A.1. $K_Z(t,\tau)$ is nonnegative definite in the sense that for all $\mathcal{L}_2$ functions
$g(t)$,

$$\int_{-T_0/2}^{T_0/2} \int_{-T_0/2}^{T_0/2} g(t)\, K_Z(t,\tau)\, g(\tau)\, dt\, d\tau \ge 0.$$

$K_Z$ also has real-valued orthonormal eigenvectors defined over $[-T_0/2, T_0/2]$ and nonnegative
eigenvalues. That is,

$$\int_{-T_0/2}^{T_0/2} K_Z(t,\tau)\, \phi_m(\tau)\, d\tau = \lambda_m \phi_m(t); \quad t \in \left[-\frac{T_0}{2}, \frac{T_0}{2}\right], \qquad \text{where } \langle \phi_m, \phi_k \rangle = \delta_{m,k}.$$

These eigenvectors span the $\mathcal{L}_2$ space of real functions over $[-T_0/2, T_0/2]$. By using these eigenvectors
as the orthonormal functions of $Z(t) = \sum_m Z_m \phi_m(t)$, it is easy to show that $\mathsf{E}[Z_m Z_k] = \lambda_m \delta_{m,k}$.
In other words, given an arbitrary covariance function over the truncated interval $[-T_0/2, T_0/2]$, we
can find a particular set of orthonormal functions so that $Z(t) = \sum_m Z_m \phi_m(t)$ and $\mathsf{E}[Z_m Z_k] =
\lambda_m \delta_{m,k}$. This is called the Karhunen-Loeve expansion.


These equations for the eigenvectors and eigenvalues are well-known integral equations and can 
be calculated by computer. Unfortunately they do not provide a great deal of insight into the 
frequency domain. 
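
As a rough illustration of how such eigenvectors might be computed, the sketch below discretizes the integral equation on a grid and extracts the two largest eigenpairs by power iteration with deflation. Everything specific here is an assumption made for the example and not taken from the text: the covariance $e^{-|t-\tau|}$, the grid size, the interval length, and the helper names. A production calculation would normally use a numerical linear algebra library instead.

```python
import math

n = 40                     # number of grid points (illustrative)
T0 = 2.0                   # the interval is [-T0/2, T0/2] (illustrative)
dt = T0 / n
grid = [-T0 / 2 + (i + 0.5) * dt for i in range(n)]
# Hypothetical covariance K(t, tau) = exp(-|t - tau|); the factor dt makes the
# matrix-vector product approximate the integral over tau.
K = [[math.exp(-abs(t - tau)) * dt for tau in grid] for t in grid]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def top_eigenpair(A, iters=300):
    """Power iteration: returns the largest eigenvalue and a unit-norm eigenvector."""
    x = [1.0 + 0.1 * i for i in range(len(A))]  # arbitrary starting vector
    for _ in range(iters):
        y = matvec(A, x)
        norm = math.sqrt(dot(y, y))
        x = [v / norm for v in y]
    return dot(x, matvec(A, x)), x

lam1, phi1 = top_eigenpair(K)
# Deflation: remove the first eigen-direction, then repeat for the second.
K_deflated = [[K[i][j] - lam1 * phi1[i] * phi1[j] for j in range(n)] for i in range(n)]
lam2, phi2 = top_eigenpair(K_deflated)

# phi1^T K phi2 is proportional to the correlation between the two expansion
# coefficients; it should vanish up to rounding.
cross = dot(phi1, matvec(K, phi2))
print(lam1, lam2, cross)
```

The vectors `phi1` and `phi2` have unit Euclidean norm, so dividing them by `math.sqrt(dt)` gives approximate samples of orthonormal functions on the interval.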
7.E Exercises


7.1. (a) Let $X, Y$ be iid rv’s, each with density $f_X(x) = \alpha \exp(-x^2/2)$. In part (b), we show
that $\alpha$ must be $1/\sqrt{2\pi}$ in order for $f_X(x)$ to integrate to 1, but in this part, we leave $\alpha$
undetermined. Let $S = X^2 + Y^2$. Find the probability density of $S$ in terms of $\alpha$. Hint:
Sketch the contours of equal probability density in the $X,Y$ plane.

(b) Prove from part (a) that $\alpha$ must be $1/\sqrt{2\pi}$ in order for $S$, and thus $X$ and $Y$, to be
random variables. Show that $\mathsf{E}[X] = 0$ and that $\mathsf{E}[X^2] = 1$.

(c) Find the probability density of $R = \sqrt{S}$. $R$ is called a Rayleigh rv.


7.2. (a) Let $X \sim \mathcal{N}(0,\sigma_X^2)$ and $Y \sim \mathcal{N}(0,\sigma_Y^2)$ be independent zero-mean Gaussian rv’s. By
convolving their densities, find the density of $X+Y$. Hint: In performing the integration for
the convolution, you should do something called “completing the square” in the exponent.
This involves multiplying and dividing by $e^{\alpha y^2/2}$ for some $\alpha$, and you can be guided in this
by knowing what the answer is. This technique is invaluable in working with Gaussian rv’s.

(b) The Fourier transform of a probability density $f_X(x)$ is $\hat f_X(\theta) = \int f_X(x) e^{-2\pi i x \theta}\, dx =
\mathsf{E}[e^{-2\pi i X \theta}]$. By scaling the basic Gaussian transform in (4.48), show that for $X \sim \mathcal{N}(0,\sigma_X^2)$,

$$\hat f_X(\theta) = \exp\left( -\frac{(2\pi\theta)^2 \sigma_X^2}{2} \right).$$

(c) Now find the density of $X + Y$ by using Fourier transforms of the densities.

(d) Using the same Fourier transform technique, find the density of $V = \sum_{k=1}^{n} \alpha_k W_k$ where
$W_1,\ldots,W_n$ are independent normal rv’s.


7.3. In this exercise you will construct two rv’s that are individually Gaussian but not jointly
Gaussian. Consider the nonnegative random variable $X$ with the density

$$f_X(x) = \sqrt{\frac{2}{\pi}} \exp\left( \frac{-x^2}{2} \right) \quad \text{for } x \ge 0.$$

Let $U$ be binary, $\pm 1$, with $p_U(1) = p_U(-1) = 1/2$.
(a) Find the probability density of $Y_1 = UX$. Sketch the density of $Y_1$ and find its mean
and variance.

(b) Describe two normalized Gaussian rv’s, say $Y_1$ and $Y_2$, such that the joint density of
$Y_1, Y_2$ is zero in the second and fourth quadrants of the plane. It is nonzero in the first
and third quadrants where it has the density $\frac{1}{\pi} \exp\left( \frac{-y_1^2 - y_2^2}{2} \right)$. Hint: Use part (a) for $Y_1$ and
think about how to construct $Y_2$.

(c) Find the covariance $\mathsf{E}[Y_1 Y_2]$. Hint: First find the mean of the rv $X$ above.

(d) Use a variation of the same idea to construct two normalized Gaussian rv’s $V_1, V_2$
whose probability is concentrated on the diagonal axes $v_1 = v_2$ and $v_1 = -v_2$, i.e., for
which $\Pr(V_1 \ne V_2 \text{ and } V_1 \ne -V_2) = 0$.

7.4. Let $W_1 \sim \mathcal{N}(0,1)$ and $W_2 \sim \mathcal{N}(0,1)$ be independent normal rv’s. Let $X = \max(W_1, W_2)$
and $Y = \min(W_1, W_2)$.
(a) Sketch the transformation from sample values of $W_1, W_2$ to sample values of $X, Y$.
Which sample pairs $w_1, w_2$ of $W_1, W_2$ map into a given sample pair $x, y$ of $X, Y$?
(b) Find the probability density $f_{XY}(x,y)$ of $X, Y$. Explain your argument briefly but
work from your sketch rather than equations.
(c) Find $f_S(s)$ where $S = X + Y$.
(d) Find $f_D(d)$ where $D = X - Y$.
(e) Let $U$ be a random variable taking the values $\pm 1$ with probability 1/2 each and let $U$
be statistically independent of $W_1, W_2$. Are $S$ and $UD$ jointly Gaussian?
7.5. Let $\phi(t)$ be an $\mathcal{L}_2$ function of energy 1 and let $h(t)$ be $\mathcal{L}_2$. Show that $\int_{-\infty}^{\infty} \phi(t) h(\tau - t)\, dt$
is an $\mathcal{L}_2$ function of $\tau$ with energy upper bounded by $\|h\|^2$. Hint: Consider the Fourier
transform of $\phi(t)$ and $h(t)$.


7.6. (a) Generalize the random process of (7.30) by assuming that the $Z_k$ are arbitrarily
correlated. Show that every sample function is still $\mathcal{L}_2$.

(b) For this same case, show that $\iint |K_Z(t,\tau)|^2\, dt\, d\tau < \infty$.


7.7. (a) Let $Z_1, Z_2, \ldots$ be a sequence of independent Gaussian rv’s, $Z_k \sim \mathcal{N}(0,\sigma_k^2)$ and let
$\{\phi_k(t) : \mathbb{R} \to \mathbb{R}\}$ be a sequence of orthonormal functions. Argue from fundamental
definitions that for each $t$, $Z(t) = \sum_{k=1}^{n} Z_k \phi_k(t)$ is a Gaussian random variable. Find the
variance of $Z(t)$ as a function of $t$.

(b) For any set of epochs, $t_1, \ldots, t_\ell$, let $Z(t_m) = \sum_{k=1}^{n} Z_k \phi_k(t_m)$ for $1 \le m \le \ell$. Explain
carefully from the basic definitions why $\{Z(t_1), \ldots, Z(t_\ell)\}$ are jointly Gaussian and specify
their covariance matrix. Explain why $\{Z(t); t \in \mathbb{R}\}$ is a Gaussian random process.

(c) Now let $n = \infty$ above and assume that $\sum_k \sigma_k^2 < \infty$. Also assume that the orthonormal
functions are bounded for all $k$ and $t$ in the sense that for some constant $A$, $|\phi_k(t)| \le A$ for
all $k$ and $t$. Consider the linear combination of rv’s

$$Z(t) = \sum_k Z_k \phi_k(t) = \lim_{n \to \infty} \sum_{k=1}^{n} Z_k \phi_k(t).$$

Let $Z^{(n)}(t) = \sum_{k=1}^{n} Z_k \phi_k(t)$. For any given $t$, find the variance of $Z^{(j)}(t) - Z^{(n)}(t)$ for
$j > n$. Show that for all $j > n$, this variance approaches 0 as $n \to \infty$. Explain intuitively
why this indicates that $Z(t)$ is a Gaussian rv. Note: $Z(t)$ is in fact a Gaussian rv, but
proving this rigorously requires considerable background. $Z(t)$ is a limit of a sequence of
rv’s, and each rv is a function of a sample space. The issue here is the same as that of a
sequence of functions going to a limit function, where we had to invoke the Riesz-Fischer
theorem.

(d) For the above Gaussian random process $\{Z(t); t \in \mathbb{R}\}$, let $z(t)$ be a sample function of
$Z(t)$ and find its energy, i.e., $\|z\|^2$ in terms of the sample values $z_1, z_2, \ldots$ of $Z_1, Z_2, \ldots$.
Find the expected energy in the process, $\mathsf{E}[\|\{Z(t); t \in \mathbb{R}\}\|^2]$.

(e) Find an upper bound on $\Pr\{\|\{Z(t); t \in \mathbb{R}\}\|^2 > \alpha\}$ that goes to zero as $\alpha \to \infty$.
Hint: You might find the Markov inequality useful. This says that for a nonnegative rv
$Y$, $\Pr\{Y \ge \alpha\} \le \mathsf{E}[Y]/\alpha$. Explain why this shows that the sample functions of $\{Z(t)\}$ are $\mathcal{L}_2$
with probability 1.


7.8. Consider a stochastic process $\{Z(t); t \in \mathbb{R}\}$ for which each sample function is a sequence of
rectangular pulses as in the figure below.

[Figure omitted: a sample function made of unit-width rectangular pulses with amplitudes $z_{-1}, z_0, z_1, z_2$.]

Analytically, $Z(t) = \sum_{k=-\infty}^{\infty} Z_k\, \mathrm{rect}(t - k)$ where $\ldots, Z_{-1}, Z_0, Z_1, \ldots$ is a sequence of iid
normal variables, $Z_k \sim \mathcal{N}(0,1)$.

(a) Is $\{Z(t); t \in \mathbb{R}\}$ a Gaussian random process? Explain why or why not carefully.

(b) Find the covariance function of $\{Z(t); t \in \mathbb{R}\}$.

(c) Is $\{Z(t); t \in \mathbb{R}\}$ a stationary random process? Explain carefully.

(d) Now suppose the stochastic process is modified by introducing a random time shift $\Phi$
which is uniformly distributed between 0 and 1. Thus, the new process, $\{V(t); t \in \mathbb{R}\}$, is
defined by $V(t) = \sum_{k=-\infty}^{\infty} Z_k\, \mathrm{rect}(t - k - \Phi)$. Find the conditional distribution of $V(0.5)$
conditional on $V(0) = v$.

(e) Is $\{V(t); t \in \mathbb{R}\}$ a Gaussian random process? Explain why or why not carefully.

(f) Find the covariance function of $\{V(t); t \in \mathbb{R}\}$.

(g) Is $\{V(t); t \in \mathbb{R}\}$ a stationary random process? It is easier to explain this than to write
a lot of equations.


7.9. Consider the Gaussian sinc process, $V(t) = \sum_k V_k\, \mathrm{sinc}\left(\frac{t-kT}{T}\right)$ where $\{\ldots, V_{-1}, V_0, V_1, \ldots\}$
is a sequence of iid rv’s, $V_k \sim \mathcal{N}(0,\sigma^2)$.
(a) Find the probability density for the linear functional $\int V(t)\, \mathrm{sinc}\left(\frac{t}{T}\right) dt$.
(b) Find the probability density for the linear functional $\int V(t)\, \mathrm{sinc}\left(\frac{\alpha t}{T}\right) dt$ for $\alpha > 1$.
(c) Consider a linear filter with impulse response $h(t) = \mathrm{sinc}\left(\frac{\alpha t}{T}\right)$ where $\alpha > 1$. Let $\{Y(t)\}$
be the output of this filter when $V(t)$ above is the input. Find the covariance function of
the process $\{Y(t)\}$. Explain why the process is Gaussian and why it is stationary.
(d) Find the probability density for the linear functional $Y(\tau) = \int V(t)\, \mathrm{sinc}\left(\frac{\alpha(t-\tau)}{T}\right) dt$ for
$\alpha > 1$ and arbitrary $\tau$.
(e) Find the spectral density of $\{Y(t); t \in \mathbb{R}\}$.
(f) Show that $\{Y(t); t \in \mathbb{R}\}$ can be represented as $Y(t) = \sum_k Y_k\, \mathrm{sinc}\left(\frac{t-kT}{T}\right)$ and characterize
the rv’s $\{Y_k; k \in \mathbb{Z}\}$.
(g) Repeat parts (c), (d), and (e) for $\alpha < 1$.
(h) Show that $\{Y(t)\}$ in the $\alpha < 1$ case can be represented as a Gaussian sinc process (like
$\{V(t)\}$ but with an appropriately modified value of $T$).

(i) Show that if any given process $\{Z(t); t \in \mathbb{R}\}$ is stationary, then so is the process $\{Y(t); t \in
\mathbb{R}\}$ where $Y(t) = Z^2(t)$ for all $t \in \mathbb{R}$.


7.10. (Complex random variables) (a) Suppose the zero-mean complex random variables $X_k$
and $X_{-k}$ satisfy $X_{-k}^* = X_k$ for all $k$. Show that if $\mathsf{E}[X_k X_{-k}^*] = 0$ then $\mathsf{E}[(\Re(X_k))^2] =
\mathsf{E}[(\Im(X_k))^2]$ and $\mathsf{E}[\Re(X_k)\Im(X_{-k})] = 0$.

(b) Use this to show that if $\mathsf{E}[X_k X_m^*] = 0$ then $\mathsf{E}[\Re(X_k)\Re(X_m)] = 0$, $\mathsf{E}[\Re(X_k)\Im(X_m)] = 0$,
and $\mathsf{E}[\Im(X_k)\Im(X_m)] = 0$ for all $m$ not equal to either $k$ or $-k$.

7.11. Explain why the integral in (7.58) must be real for $g_1(t)$ and $g_2(t)$ real, but the integrand
$\hat g_1(f) S_Z(f) \hat g_2^*(f)$ need not be real.
7.12. (Filtered white noise) Let $\{Z(t)\}$ be a White Gaussian noise process of spectral density
$N_0/2$.
(a) Let $Y = \int_0^T Z(t)\, dt$. Find the probability density of $Y$.
(b) Let $Y(t)$ be the result of passing $Z(t)$ through an ideal baseband filter of bandwidth
$W$ whose gain is adjusted so that its impulse response has unit energy. Find the joint
distribution of $Y(0)$ and $Y(\frac{1}{4W})$.
(c) Find the probability density of

$$V = \int_0^\infty e^{-t} Z(t)\, dt.$$


7.13. (Power spectral density) (a) Let $\{\phi_k(t)\}$ be any set of real orthonormal $\mathcal{L}_2$ waveforms whose
transforms are limited to a band $B$, and let $\{W(t)\}$ be white Gaussian noise with respect
to $B$ with power spectral density $S_W(f) = N_0/2$ for $f \in B$. Let the orthonormal expansion
of $W(t)$ with respect to the set $\{\phi_k(t)\}$ be defined by

$$\tilde W(t) = \sum_k W_k \phi_k(t),$$

where $W_k = \langle W(t), \phi_k(t) \rangle$. Show that $\{W_k\}$ is an iid Gaussian sequence, and give the
probability distribution of each $W_k$.

(b) Let the band $B$ be $B = [-1/2T, 1/2T]$, and let $\phi_k(t) = (1/\sqrt{T})\, \mathrm{sinc}\left(\frac{t-kT}{T}\right)$, $k \in \mathbb{Z}$.
Interpret the result of part (a) in this case.


7.14. (Complex Gaussian vectors) (a) Give an example of a 2 dimensional complex rv $Z =
(Z_1, Z_2)$ where $Z_k \sim \mathcal{CN}(0,1)$ for $k = 1,2$ and where $Z$ has the same joint probability
distribution as $e^{i\phi}Z$ for all $\phi \in [0, 2\pi]$ but where $Z$ is not jointly Gaussian and thus not
circularly symmetric. Hint: Extend the idea in part (d) of Exercise 7.3.

(b) Suppose a complex random variable $Z = Z_{re} + iZ_{im}$ has the properties that $Z_{re}$ and
$Z_{im}$ are individually Gaussian and that $Z$ has the same probability density as $e^{i\phi}Z$ for all
$\phi \in [0, 2\pi]$. Show that $Z$ is complex circularly symmetric Gaussian.
Chapter 8


Detection, coding, and decoding 


8.1 Introduction 


The previous chapter showed how to characterize noise as a random process and this chapter 
uses that characterization to retrieve the signal from the noise corrupted received waveform. 
As one might guess, this is not possible without occasional errors when the noise is unusually 
large. The objective then, is to retrieve the data while minimizing the effect of these errors. 
This process of retrieving data from a noise corrupted version is known as detection. 


Detection, decision making, hypothesis testing, and decoding are synonyms. The word detection 
refers to the effort to detect whether some phenomenon is present or not on the basis of obser- 
vations. For example, a radar system uses the observations to detect whether or not a target is 
present; a quality control system attempts to detect whether a unit is defective; a medical test 
detects whether a given disease is present. The meaning of detection has been extended in the 
digital communication field from a yes/no decision to a decision at the receiver from a finite set 
of possible transmitted signals. Such a decision from a set of possible transmitted signals is also 
called decoding, but here the possible set is usually regarded as the codewords in a code rather 
than the signals in a signal set.¹ Decision making is, again, the process of deciding between a
number of mutually exclusive alternatives. Hypothesis testing is the same, and here the mutually 
exclusive alternatives are called hypotheses. We use the word hypotheses for the possible choices 
in what follows, since the word conjures up the appropriate intuitive image of making a choice 
between a set of alternatives, where only one alternative is correct and there is a possibility of 
erroneous choice. 


These problems will be studied initially in a purely probabilistic setting. That is, there is a 
probability model within which each hypothesis is an event. These events are mutually exclusive 
and collectively exhaustive, i.e., the sample outcome of the experiment lies in one and only one 
of these events, which means that in each performance of the experiment, one and only one 
hypothesis is correct. Assume there are $M$ hypotheses², labeled $a_0, \ldots, a_{M-1}$. The sample
outcome of the experiment will lie in one of these $M$ events. This defines a random symbol $U$


¹ As explained more fully later, there is no fundamental difference between a code and a signal set.

² The principles here apply essentially without change for a countably infinite set of hypotheses; for an
uncountably infinite set of hypotheses, the process of choosing a hypothesis from an observation is called estimation.
Typically, the probability of choosing correctly in this case is 0 and the emphasis is on making an estimate that
is close in some sense to the correct hypothesis.
which, for each $m$, takes the sample value $a_m$ when event $a_m$ occurs. The marginal probability
$p_U(a_m)$ of hypothesis $a_m$ is denoted $p_m$ and is usually referred to as the a priori probability of
$a_m$. There is also a random variable (rv) $V$, called the observation. This is the data on which the
decision must be based. A sample value v of V is observed, and on the basis of that observation, 
the detector selects one of the possible M hypotheses. The observation could equally well be a 
complex random variable, a random vector, a random process, or a random symbol, and these 
generalizations are discussed in what follows. 


Before discussing how to make decisions, it is important to understand when and why decisions 
must be made. As a binary example, assume that the conditional probability of hypothesis $a_0$,
given the observation, is 2/3 and that of hypothesis $a_1$ is 1/3. Simply deciding on hypothesis $a_0$
and forgetting about the probabilities throws away the information about the probability that
the decision is correct. However, actual decisions sometimes must be made. In a communication
system, the user usually wants to receive the message (even partly garbled) rather than a set of
probabilities. In a control system, the controls must occasionally take action. Similarly, managers
must occasionally choose between courses of action, between products, and between people to
hire. In a sense, it is by making decisions that we return from the world of mathematical 
probability models to the world being modeled. 


There are a number of possible criteria to use in making decisions. Initially assume that the
criterion is to maximize the probability of correct choice. That is, when the experiment is
performed, the resulting experimental outcome maps into both a sample value $a_m$ for $U$ and a
sample value $v$ for $V$. The decision maker observes $v$ (but not $a_m$) and maps $v$ into a decision
$\tilde u(v)$. The decision is correct if $\tilde u(v) = a_m$. In principle, maximizing the probability of correct
choice is almost trivially simple. Given $v$, calculate $p_{U|V}(a_m \mid v)$ for each possible hypothesis $a_m$.
This is the probability that $a_m$ is the correct hypothesis conditional on $v$. Thus the rule for
maximizing the probability of being correct is to choose $\tilde u(v)$ to be that $a_m$ for which $p_{U|V}(a_m \mid v)$
is maximized. For each possible observation $v$, this is denoted

$$\tilde u(v) = \arg\max_m\, [p_{U|V}(a_m \mid v)] \quad \text{(MAP rule)}, \qquad (8.1)$$

where $\arg\max_m$ means the argument $m$ that maximizes the function. If the maximum is not
unique, the probability of being correct is the same no matter which maximizing $m$ is chosen, so
to be explicit, the smallest such $m$ will be chosen.³ Since the rule (8.1) applies to each possible
sample output $v$ of the random variable $V$, (8.1) also defines the selected hypothesis as a random
symbol $\tilde U(V)$. The conditional probability $p_{U|V}$ is called an a posteriori probability. This is in
contrast to the a priori probability $p_m$ of the hypothesis before the observation of $V$. The decision
rule in (8.1) is thus called the maximum a posteriori probability (MAP) rule.
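
The MAP rule can be sketched for a small discrete example. The sketch assumes a discrete observation taking values 0, 1, 2; the a priori probabilities, the conditional probabilities, and the name `map_decision` are all hypothetical and chosen only for illustration. By Bayes' rule the a posteriori probability of $a_m$ given $v$ is proportional to $p_m$ times the probability of $v$ given $a_m$, so the sketch compares those products directly.

```python
def map_decision(priors, likelihoods, v):
    """Return the smallest index m that maximizes p_m * P(v | a_m).

    Dividing by P(v) would give the a posteriori probabilities but would not
    change which m attains the maximum, so the division is skipped.
    """
    scores = [p_m * lik_m[v] for p_m, lik_m in zip(priors, likelihoods)]
    # max() returns the first maximal element, i.e. the smallest maximizing m.
    return max(range(len(scores)), key=lambda m: scores[m])

# Hypothetical model with M = 3 hypotheses and observation values v in {0, 1, 2}.
priors = [0.5, 0.3, 0.2]
likelihoods = [
    [0.7, 0.2, 0.1],  # P(v | a_0)
    [0.1, 0.6, 0.3],  # P(v | a_1)
    [0.2, 0.2, 0.6],  # P(v | a_2)
]

decisions = [map_decision(priors, likelihoods, v) for v in range(3)]
# Probability of correct decision: sum over v of the joint probability of
# (decided hypothesis, v).
p_correct = sum(priors[m] * likelihoods[m][v] for v, m in enumerate(decisions))
print(decisions, p_correct)
```

For these assumed numbers each observation value selects a different hypothesis, and no other mapping from $v$ to a hypothesis gives a larger probability of correct decision.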


An important consequence of (8.1) is that the MAP rule depends only on the conditional
probability $p_{U|V}$ and thus is completely determined by the joint distribution of $U$ and $V$. Everything
else in the probability space is irrelevant to making a MAP decision.


When distinguishing between different decision rules, the MAP decision rule in (8.1) is denoted
as $\tilde u_{\rm MAP}(v)$. Since the MAP rule maximizes the probability of correct decision for each sample
value $v$, it also maximizes the probability of correct decision averaged over all $v$. To see this
analytically, let $\tilde u_D(v)$ be an arbitrary decision rule. Since $\tilde u_{\rm MAP}$ maximizes $p_{U|V}(a_m \mid v)$ over $m$,

$$p_{U|V}(\tilde u_{\rm MAP}(v) \mid v) - p_{U|V}(\tilde u_D(v) \mid v) \ge 0; \quad \text{for each rule } D \text{ and observation } v. \qquad (8.2)$$

³ As discussed in the appendix, it is sometimes desirable to choose randomly among the maximum a posteriori
choices when the maximum in (8.1) is not unique. There are often situations (such as with discrete coding and
decoding) where non-uniqueness occurs with positive probability.


Taking the expected value of the first term on the left over the observation V, we get the 
probability of correct decision using the MAP decision rule. The expected value of the second 
term on the left, for any given D is the probability of correct decision using that rule. Thus, 
taking the expected value of (8.2) over V shows that the MAP rule maximizes the probability 
of correct decision over the observation space. The above results are very simple, but also 
important and fundamental. They are summarized in the following theorem. 


Theorem 8.1.1. The MAP rule, given in (8.1), maximizes the probability of correct decision, 
both for each observed sample value v and as an average over V. The MAP rule is determined 
solely by the joint distribution of U and V. 


Before discussing the implications and use of the MAP rule, the above assumptions are reviewed. 
First, a probability model was assumed in which all probabilities are known, and in which, for 
each performance of the experiment, one and only one hypothesis is correct. This conforms very 
well to the communication model in which a transmitter sends one of a set of possible signals, 
and the receiver, given signal plus noise, makes a decision on the signal actually sent. It does not 
always conform well to a scientific experiment attempting to verify the existence of some new 
phenomenon; in such situations, there is often no sensible way to model a priori probabilities. 
Detection in the absence of known a priori probabilities is discussed in the appendix. 


The next assumption was that maximizing the probability of correct decision is an appropriate 
decision criterion. In many situations, the cost of a wrong decision is highly asymmetric. For 
example, when testing for a treatable but deadly disease, making an error when the disease is 
present is far more costly than making an error when the disease is not present. As shown in 
Exercise 8.1, it is easy to extend the theory to account for relative costs of errors. 


With the present assumptions, the detection problem can be stated concisely in the following
probabilistic terms. There is an underlying sample space $\Omega$, a probability measure, and two rv’s
$U$ and $V$ of interest. The corresponding experiment is performed, an observer sees the sample
value $v$ of rv $V$, but does not observe anything else, particularly not the sample value of $U$, say
$a_m$. The observer uses a detection rule, $\tilde u(v)$, which is a function mapping each possible value
of $v$ to a possible value, $a_0$ to $a_{M-1}$, of $U$. If $\tilde u(v) = a_m$, the detection is correct, and otherwise
an error has been made. The above MAP rule maximizes the probability of correct detection
conditional on each $v$ and also maximizes the unconditional probability of correct detection.
Obviously, the observer must know the conditional probability assignment $p_{U|V}$ in order to use
the MAP rule.


The next two sections are restricted to the case of binary hypotheses, ($M = 2$). This allows us
to understand most of the important ideas but simplifies the notation considerably. This is then 
generalized to an arbitrary number of hypotheses; fortunately this extension is almost trivial. 


8.2 Binary detection 


Assume a probability model in which the correct hypothesis $U$ is a binary random variable with
possible values $\{a_0, a_1\}$ and a priori probabilities $p_0$ and $p_1$. In the communication context,
the a priori probabilities are usually modeled as equiprobable, but occasionally there are multi-stage
detection processes in which the result of the first stage leads to non-equiprobable a priori
probabilities in subsequent stages. Thus let $p_0$ and $p_1 = 1 - p_0$ be arbitrary. Let $V$ be a
rv with a conditional probability density $f_{V|U}(v \mid a_m)$ that is finite and non-zero for all $v \in \mathbb{R}$
and $m \in \{0, 1\}$. The modifications for zero densities, discrete $V$, complex $V$, or vector $V$ are
relatively straightforward and discussed later.


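The conditional densities $f_{V|U}(v \mid a_m)$, $m \in \{0,1\}$ are called likelihoods in the jargon of hypothesis
testing. The marginal density of $V$ is given by $f_V(v) = p_0 f_{V|U}(v \mid a_0) + p_1 f_{V|U}(v \mid a_1)$. The a
posteriori probability of $U$, for $m = 0$ or 1, is given by

$$p_{U|V}(a_m \mid v) = \frac{p_m f_{V|U}(v \mid a_m)}{f_V(v)}. \qquad (8.3)$$

Writing out (8.1) explicitly for this case,

$$\frac{p_0 f_{V|U}(v \mid a_0)}{f_V(v)} \;\; \overset{\tilde U = a_0}{\underset{\tilde U = a_1}{\begin{smallmatrix} \ge \\ < \end{smallmatrix}}} \;\; \frac{p_1 f_{V|U}(v \mid a_1)}{f_V(v)}. \qquad (8.4)$$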

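This “equation” indicates that the MAP decision is $a_0$ if the left side is greater than or equal
to the right, and is $a_1$ if the left side is less than the right. Choosing the decision $\tilde U = a_0$ when
equality holds in (8.4) is an arbitrary choice and does not affect the probability of being correct.
Canceling $f_V(v)$ and rearranging,

$$\Lambda(v) = \frac{f_{V|U}(v \mid a_0)}{f_{V|U}(v \mid a_1)} \;\; \overset{\tilde U = a_0}{\underset{\tilde U = a_1}{\begin{smallmatrix} \ge \\ < \end{smallmatrix}}} \;\; \frac{p_1}{p_0} = \eta. \qquad (8.5)$$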


$\Lambda(v) = f_{V|U}(v \mid a_0)/f_{V|U}(v \mid a_1)$ is called the likelihood ratio, and is a function only of $v$. The
ratio $\eta = p_1/p_0$ is called the threshold and depends only on the a priori probabilities. The
binary MAP rule (or MAP test, as it is usually called) then compares the likelihood ratio to
the threshold, and decides on hypothesis $a_0$ if the threshold is reached, and on hypothesis $a_1$
otherwise. Note that if the a priori probability $p_0$ is increased, the threshold decreases, and
the set of $v$ for which hypothesis $a_0$ is chosen increases; this corresponds to our intuition: the
more certain we are initially that $U$ is 0, the stronger the evidence required to make us change
our minds. As shown in Exercise 8.1, the only effect of minimizing over costs rather than error
probability is to change the threshold $\eta$ in (8.5).


An important special case of (8.5) is that in which $p_0 = p_1$. In this case $\eta = 1$, and the rule
chooses $\tilde U(v) = a_0$ for $f_{V|U}(v \mid a_0) \ge f_{V|U}(v \mid a_1)$ and chooses $\tilde U(v) = a_1$ otherwise. This is called
a maximum likelihood (ML) rule or test. In the communication case, as mentioned above, the
a priori probabilities are usually equal, so MAP then reduces to ML. The maximum likelihood
test is also often used when $p_0$ and $p_1$ are unknown.
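
The threshold test (8.5) can be sketched for one concrete pair of likelihoods. The Gaussian densities with means $+1$ under $a_0$ and $-1$ under $a_1$, the unit variance, and the function names below are assumptions made for illustration only; they are not part of the general development above.

```python
import math

def gaussian_pdf(v, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at v."""
    return math.exp(-(v - mean) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def likelihood_ratio(v, mean0, mean1, sigma):
    """Lambda(v) = f(v | a_0) / f(v | a_1) for the assumed Gaussian likelihoods."""
    return gaussian_pdf(v, mean0, sigma) / gaussian_pdf(v, mean1, sigma)

def map_test(v, p0, mean0=1.0, mean1=-1.0, sigma=1.0):
    """Return 0 (decide a_0) if Lambda(v) >= eta = p1/p0, and 1 (decide a_1) otherwise."""
    eta = (1.0 - p0) / p0
    return 0 if likelihood_ratio(v, mean0, mean1, sigma) >= eta else 1

# With equal priors (eta = 1) the test is the ML test and splits the line at v = 0.
print(map_test(0.0, 0.5), map_test(-0.1, 0.5))
# Raising p0 lowers the threshold, so a_0 is still chosen at v = -0.1.
print(map_test(-0.1, 0.9), map_test(-2.0, 0.9))
```

For these assumed densities $\Lambda(v) = e^{2v}$, so the test reduces to comparing $v$ with $\frac{1}{2}\ln\eta$; this is an instance of working with a one-to-one function of the likelihood ratio rather than with the ratio itself.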


The probability of error, i.e., one minus the probability of choosing correctly, is now derived
for MAP detection. First we find the probability of error conditional on each hypothesis,
$\Pr\{e \mid U{=}a_1\}$ and $\Pr\{e \mid U{=}a_0\}$. The overall probability of error is then given by

$$\Pr\{e\} = p_0 \Pr\{e \mid U{=}a_0\} + p_1 \Pr\{e \mid U{=}a_1\}.$$

In the radar field, $\Pr\{e \mid U{=}a_0\}$ is called the probability of false alarm, and $\Pr\{e \mid U{=}a_1\}$ is
called the probability of a miss. Also $1 - \Pr\{e \mid U{=}a_1\}$ is called the probability of detection. In
statistics, $\Pr\{e \mid U{=}a_1\}$ is called the probability of error of the second kind, and $\Pr\{e \mid U{=}a_0\}$ is
the probability of error of the first kind. These terms are not used here.


Note that (8.5) partitions the space of observed sample values into 2 regions. $R_0 = \{v : \Lambda(v) \ge \eta\}$
is the region for which $\tilde U = a_0$ and $R_1 = \{v : \Lambda(v) < \eta\}$ is the region for which $\tilde U = a_1$. For
$U = a_1$, an error occurs if and only if $v$ is in $R_0$, and for $U = a_0$, an error occurs if and only if
$v$ is in $R_1$. Thus,

$$\Pr\{e \mid U{=}a_0\} = \int_{R_1} f_{V|U}(v \mid a_0)\, dv. \qquad (8.6)$$

$$\Pr\{e \mid U{=}a_1\} = \int_{R_0} f_{V|U}(v \mid a_1)\, dv. \qquad (8.7)$$


Another, often simpler, approach is to work directly with the likelihood ratio. Since $\Lambda(v)$ is
a function of the observed sample value v, the random variable $\Lambda(V)$, also called a likelihood
ratio, is defined as follows: for every sample point $\omega$, $V(\omega)$ is the corresponding sample value
v, and $\Lambda(V)$ is then shorthand for $\Lambda(V(\omega))$. In the same way, $\hat{U}(V)$ (or more briefly $\hat{U}$) is the
decision random variable. In these terms, (8.5) states that

\[ \hat{U} = a_0 \quad \text{if and only if} \quad \Lambda(V) \ge \eta. \tag{8.8} \]

Thus, for MAP detection with a threshold $\eta$,

\[ \Pr\{e \mid U=a_0\} = \Pr\{\hat{U}=a_1 \mid U=a_0\} = \Pr\{\Lambda(V) < \eta \mid U=a_0\}. \tag{8.9} \]

\[ \Pr\{e \mid U=a_1\} = \Pr\{\hat{U}=a_0 \mid U=a_1\} = \Pr\{\Lambda(V) \ge \eta \mid U=a_1\}. \tag{8.10} \]


A sufficient statistic is defined as any function of the observation v from which the likelihood ratio
can be calculated. As examples, v itself, $\Lambda(v)$, and any one-to-one function of $\Lambda(v)$ are sufficient
statistics. $\Lambda(v)$, and functions of $\Lambda(v)$, are often simpler to work with than v in calculating
the probability of error. This will be particularly true when vector or process observations are
discussed, since $\Lambda(v)$ is always one dimensional and real.


We have seen that the MAP rule (and thus also the ML rule) is a threshold test on the likelihood 
ratio. Similarly the min-cost rule, (see Exercise 8.1), and the Neyman-Pearson test (which, as 
shown in the appendix, makes no assumptions about a priori probabilities), are threshold tests 
on the likelihood ratio. Not only are all these binary decision rules based only on threshold 
tests on the likelihood ratio, but the properties of these rules, such as the conditional error 
probabilities in (8.9) and (8.10), are based only on $\Lambda(V)$ and $\eta$. In fact, it is difficult to imagine
any sensible binary decision procedure, especially in the digital communication context, that is 
not a threshold test on the likelihood ratio. Thus, once a sufficient statistic has been calculated 
from the observed vector, that observed vector has no further value in any decision rule of 
interest here. 


The log likelihood ratio, $\mathrm{LLR}(V) = \ln[\Lambda(V)]$, is an important sufficient statistic which is often
easier to work with than the likelihood ratio itself. As seen in the next section, the LLR is 
particularly convenient with Gaussian noise statistics. 


8.3 Binary signals in white Gaussian noise


This section first treats standard 2-PAM, then 2-PAM with an offset, then binary signals with 
vector observations, and finally binary signals with waveform observations. 


8.3.1 Detection for PAM antipodal signals 


Consider PAM antipodal modulation (i.e., 2-PAM), as illustrated in Figure 8.1.


[Figure 8.1 (block diagram): Input {0,1} -> Encoder ({0,1} -> ±a) -> U = ±a -> Baseband modulator ->
Baseband to passband -> (WGN added) -> Passband to baseband -> Baseband demodulator -> V = U + Z ->
Detector (V -> $\hat{U}$ -> {0,1}) -> Output {0,1}]

Figure 8.1: The source produces a binary digit which is mapped into U = ±a. This is
modulated into a waveform, WGN is added, the resultant waveform is demodulated and
sampled, resulting in a noisy received value V = U + Z. From Section 7.8, $Z \sim N(0, N_0/2)$.
This is explained more fully later. Based on this observation the receiver makes a decision
$\hat{U}$ and maps this back to the binary output, which is the hypothesized version of the binary
input.


The correct hypothesis U is either $a_0 = a$ or $a_1 = -a$. Let $Z \sim N(0, N_0/2)$ be a Gaussian noise
rv of mean 0 and variance $N_0/2$, independent of U. That is,

\[ f_Z(z) = \frac{1}{\sqrt{\pi N_0}} \exp\left[\frac{-z^2}{N_0}\right]. \]


Assume that 2-PAM is simplified by sending only a single binary symbol (rather than a sequence 
over time) and by observing only the single sample value v corresponding to that input. As seen 
later, these simplifications are unnecessary, but they permit the problem to be viewed in the 
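simplest possible context. The observation V (i.e., the channel output prior to detection) is a + Z
or −a + Z, depending on whether U = a or −a. Thus, conditional on U = a, $V \sim N(a, N_0/2)$
and, conditional on U = −a, $V \sim N(-a, N_0/2)$.

\[ f_{V|U}(v \mid a) = \frac{1}{\sqrt{\pi N_0}} \exp\left[\frac{-(v-a)^2}{N_0}\right]; \qquad f_{V|U}(v \mid -a) = \frac{1}{\sqrt{\pi N_0}} \exp\left[\frac{-(v+a)^2}{N_0}\right]. \]

The likelihood ratio is the ratio of these likelihoods, and is given by

\[ \Lambda(v) = \exp\left[\frac{-(v-a)^2 + (v+a)^2}{N_0}\right] = \exp\left[\frac{4av}{N_0}\right]. \tag{8.11} \]

Substituting this into (8.5),

\[ \exp\left[\frac{4av}{N_0}\right] \;\underset{\hat{U}=-a}{\overset{\hat{U}=a}{\gtreqless}}\; \frac{p_1}{p_0} = \eta. \tag{8.12} \]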


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare
(http: //ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 




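This is further simplified by taking the logarithm, yielding

\[ \mathrm{LLR}(v) = \frac{4av}{N_0} \;\underset{\hat{U}=-a}{\overset{\hat{U}=a}{\gtreqless}}\; \ln(\eta). \tag{8.13} \]

\[ v \;\underset{\hat{U}=-a}{\overset{\hat{U}=a}{\gtreqless}}\; \frac{N_0 \ln(\eta)}{4a}. \tag{8.14} \]

Figure 8.2 interprets this decision rule.

[Figure 8.2 (plot): the density $f_{V|U}(v \mid -a)$ over an axis marked $-a$, 0, $a$, with the threshold
$(N_0/4a)\ln\eta$, the decision region $\hat{U} = a$, and the shaded tail $\Pr\{\hat{U} = a \mid U = -a\}$]
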
Figure 8.2: Binary hypothesis testing for antipodal signals, 0 → a, 1 → −a. The a priori
probabilities are $p_0$ and $p_1$, the threshold is $\eta = p_1/p_0$, and the noise is $N(0, N_0/2)$.


The probability of error, given U = −a, is seen to be the probability that the noise value is
greater than $a + \frac{N_0 \ln(\eta)}{4a}$. Since the noise has variance $N_0/2$, this is the probability that the
normalized Gaussian rv $Z/\sqrt{N_0/2}$ exceeds $a/\sqrt{N_0/2} + \sqrt{N_0/2}\,\ln(\eta)/(2a)$. Thus,

\[ \Pr\{e \mid U=-a\} = Q\left(\frac{a}{\sqrt{N_0/2}} + \frac{\sqrt{N_0/2}\,\ln\eta}{2a}\right), \tag{8.15} \]


where Q(x), the complementary distribution function of $N(0,1)$, is given by

\[ Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(\frac{-z^2}{2}\right) dz. \]


The probability of error given U = a is calculated the same way, but is the probability that Z is
less than or equal to $-a + \frac{N_0 \ln(\eta)}{4a}$. Since −Z has the same distribution as Z,

\[ \Pr\{e \mid U=a\} = Q\left(\frac{a}{\sqrt{N_0/2}} - \frac{\sqrt{N_0/2}\,\ln\eta}{2a}\right). \tag{8.16} \]


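It is more insightful to express $a/\sqrt{N_0/2}$ as $\sqrt{2a^2/N_0}$. As seen before, $a^2$ can be viewed as the
energy per bit, $E_b$, so that (8.15) and (8.16) become

\[ \Pr\{e \mid U=-a\} = Q\left(\sqrt{\frac{2E_b}{N_0}} + \frac{\ln\eta}{2\sqrt{2E_b/N_0}}\right) \tag{8.17} \]

\[ \Pr\{e \mid U=a\} = Q\left(\sqrt{\frac{2E_b}{N_0}} - \frac{\ln\eta}{2\sqrt{2E_b/N_0}}\right) \tag{8.18} \]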

Note that these formulas involve only the ratio $E_b/N_0$ rather than $E_b$ or $N_0$ separately. If the
signal, observation, and noise had been measured on a different scale, then both $E_b$ and $N_0$
would change by the same factor, helping explain why only the ratio is relevant. In fact, the
scale could be normalized so that either the noise has variance 1 or the signal has variance 1.


The hypotheses in these communication problems are usually modeled as equiprobable, $p_0 =
p_1 = 1/2$. In this case, $\ln\eta = 0$ and MAP detection is equivalent to ML. Eqns. (8.17) and (8.18)
then simplify to

\[ \Pr\{e\} = \Pr\{e \mid U=-a\} = \Pr\{e \mid U=a\} = Q\left(\sqrt{\frac{2E_b}{N_0}}\right). \tag{8.19} \]


In terms of Figure 8.2, this is the tail of either Gaussian distribution from the point 0 where 
they cross. This equation keeps reappearing in different guises, and it will soon seem like a 
completely obvious result for a variety of Gaussian detection problems. 
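
As a numerical illustration of (8.17) to (8.19), the following sketch evaluates the conditional and overall error probabilities for 2-PAM. It assumes the threshold $\eta = p_1/p_0$ and the mapping 0 → a, 1 → −a; the function names are hypothetical, and Q(x) is computed from the standard identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$.

```python
import math

def q_function(x):
    """Q(x) = Pr{N(0,1) > x}, computed via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def antipodal_error_probs(eb_over_n0, eta=1.0):
    """Return (Pr{e|U=-a}, Pr{e|U=a}) as given by (8.17) and (8.18)."""
    root = math.sqrt(2.0 * eb_over_n0)
    shift = math.log(eta) / (2.0 * root)
    return q_function(root + shift), q_function(root - shift)

def overall_error_prob(eb_over_n0, p0=0.5):
    """Pr{e} = p0 Pr{e|U=a} + p1 Pr{e|U=-a}, using the MAP threshold eta = p1/p0."""
    p1 = 1.0 - p0
    pe_minus, pe_plus = antipodal_error_probs(eb_over_n0, p1 / p0)
    return p0 * pe_plus + p1 * pe_minus

pe_ml = overall_error_prob(1.0)            # equiprobable inputs: Q(sqrt(2 Eb/N0))
pe_map = overall_error_prob(1.0, p0=0.9)   # unequal priors lower the MAP error
```

For equiprobable inputs the two conditional error probabilities coincide and the result reduces to (8.19); for unequal priors the MAP threshold trades a larger error probability under the less likely hypothesis for a smaller one under the more likely hypothesis.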


8.3.2 Detection for binary non-antipodal signals 


Next consider the slightly more complex case illustrated in Figure 8.3. Instead of mapping 0 to
+a and 1 to −a, 0 is mapped to an arbitrary number $b_0$ and 1 to an arbitrary number $b_1$. To
analyze this, let c be the mid-point between $b_0$ and $b_1$, $c = (b_0 + b_1)/2$. Assuming $b_1 < b_0$, let
$a = b_0 - c = c - b_1$. Conditional on $U = b_0$, the observation is V = c + a + Z; conditional on
$U = b_1$, it is V = c − a + Z. In other words, this more general case is simply the result of shifting
the previous signals by the constant c.


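[Figure 8.3 (plot): the densities $f_{V|U}(v \mid b_1)$ and $f_{V|U}(v \mid b_0)$ over an axis marked $b_1$, c, $b_0$, with the
threshold offset $(N_0/4a)\ln\eta$ from c, the decision region $\hat{U} = b_0$, and the shaded tail $\Pr\{\hat{U} = b_0 \mid U = b_1\}$]

Figure 8.3: Binary hypothesis testing for arbitrary signals, 0 → $b_0$, 1 → $b_1$, for $b_0 > b_1$. With
$c = (b_0 + b_1)/2$ and $a = |b_0 - b_1|/2$, this is the same as Figure 8.2 shifted by c. For $b_0 < b_1$, the
picture must be reversed, but the answer is the same.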

Define $\tilde{V} = V - c$ as the result of shifting the observation by −c. $\tilde{V}$ is a sufficient statistic and
$\tilde{V} = \pm a + Z$. This is the same as the problem above, so the error probability is again given by
(8.15) and (8.16).

The energy used in achieving this error probability has changed from the antipodal case. Assuming
equal a priori probabilities, the energy per bit is now $(b_0^2 + b_1^2)/2 = a^2 + c^2$. A center
value c is frequently used as a ‘pilot tone’ in communication for tracking the channel. We see
that $E_b$ is then the sum of the energy used for the actual binary transmission ($a^2$) plus the
energy used for the pilot tone ($c^2$). The fraction of energy $E_b$ used for the signal is $\gamma = \frac{a^2}{a^2 + c^2}$.
This changes (8.19) to

\[ \Pr\{e \mid U=b_1\} = \Pr\{e \mid U=b_0\} = Q\left(\sqrt{\frac{2\gamma E_b}{N_0}}\right). \tag{8.20} \]

For example, a common binary communication technique called on-off keying uses the binary
signals 0 and 2a. In this case, $\gamma = 1/2$ and there is an energy loss of 3 dB from the antipodal
case. For ML, the probability of error then becomes $Q(\sqrt{E_b/N_0})$.
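
The role of the fraction $\gamma$ in (8.20) can be checked numerically. The sketch below is illustrative only; the function names are hypothetical, and equiprobable scalar signals $b_0$ and $b_1$ (with $b_0 \ne b_1$) are assumed.

```python
import math

def q_function(x):
    """Q(x) = Pr{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def signal_energy_fraction(b0, b1):
    """gamma = a^2 / (a^2 + c^2), with c the midpoint and a half the separation."""
    c = (b0 + b1) / 2.0
    a = abs(b0 - b1) / 2.0
    return a * a / (a * a + c * c)

def ml_error_prob(b0, b1, n0):
    """ML error probability Q(sqrt(2 gamma Eb / N0)) from (8.20), equiprobable inputs."""
    eb = (b0 * b0 + b1 * b1) / 2.0
    gamma = signal_energy_fraction(b0, b1)
    return q_function(math.sqrt(2.0 * gamma * eb / n0))

gamma_ook = signal_energy_fraction(0.0, 2.0)        # on-off keying with signals 0 and 2a, a = 1
gamma_antipodal = signal_energy_fraction(1.0, -1.0)  # antipodal signals: no pilot energy
pe_ook = ml_error_prob(0.0, 2.0, 1.0)                # equals Q(sqrt(Eb/N0)) with Eb = 2, N0 = 1
```

Since $\gamma E_b = a^2$, the error probability depends only on the half-separation a; the midpoint c costs energy without improving detection.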


8.3.3 Detection for binary real vectors in WGN 


Next consider the vector version of the Gaussian detection problem. Suppose the observation
is a random n-vector V = U + Z. The noise Z is a random n-vector $(Z_1, Z_2, \ldots, Z_n)^T$,
independent of U, with iid components given by $Z_k \sim N(0, N_0/2)$. The input U is a random
n-vector with M possible values (hypotheses). The mth hypothesis, $0 \le m \le M-1$, is denoted
by $a_m = (a_{m1}, a_{m2}, \ldots, a_{mn})^T$. A sample value v of V is observed and the problem is to make
a MAP decision, denoted $\hat{U}$, about U.

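Initially assume the binary antipodal case where $a_1 = -a_0$. For notational simplicity, let $a_0$ be
denoted as $a = (a_1, a_2, \ldots, a_n)^T$. Thus the two hypotheses are U = a and U = −a and the
observation is either a + Z or −a + Z. The likelihoods are then given by

\[ f_{V|U}(v \mid a) = \frac{1}{(\pi N_0)^{n/2}} \exp \sum_{k=1}^{n} \frac{-(v_k - a_k)^2}{N_0} = \frac{1}{(\pi N_0)^{n/2}} \exp\left(\frac{-\|v - a\|^2}{N_0}\right) \]

\[ f_{V|U}(v \mid -a) = \frac{1}{(\pi N_0)^{n/2}} \exp \sum_{k=1}^{n} \frac{-(v_k + a_k)^2}{N_0} = \frac{1}{(\pi N_0)^{n/2}} \exp\left(\frac{-\|v + a\|^2}{N_0}\right). \]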


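The log likelihood ratio is thus given by

\[ \mathrm{LLR}(v) = \frac{-\|v - a\|^2 + \|v + a\|^2}{N_0} = \frac{4\langle v, a\rangle}{N_0} \tag{8.21} \]

and the MAP test is

\[ \mathrm{LLR}(v) = \frac{4\langle v, a\rangle}{N_0} \;\underset{\hat{U}=-a}{\overset{\hat{U}=a}{\gtreqless}}\; \ln\frac{p_1}{p_0} = \ln(\eta). \]

This can be restated as

\[ \frac{\langle v, a\rangle}{\|a\|} \;\underset{\hat{U}=-a}{\overset{\hat{U}=a}{\gtreqless}}\; \frac{N_0 \ln(\eta)}{4\|a\|}. \tag{8.22} \]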


The projection of the observation v onto the signal a is $\frac{\langle v, a\rangle}{\|a\|}\,\frac{a}{\|a\|}$. Thus the left side of (8.22) is
the component of v in the direction of a, thus showing that the decision is based solely on that 
component of v. This result is rather natural; the noise is independent in different orthogonal 
directions, and only the noise in the direction of the signal should be relevant in detecting the 
signal. 


The geometry of the situation is particularly clear in the ML case (see Figure 8.4). The noise is
spherically symmetric around the origin, and the likelihoods depend only on the distance from
the origin. The ML detection rule is then equivalent to choosing the hypothesis closest to the
received point. The set of points equidistant from the two hypotheses, as illustrated in Figure
8.4, is the perpendicular bisector between them; this bisector is the set of v satisfying $\langle v, a\rangle = 0$.
The set of points closer to a is on the a side of this perpendicular bisector; it is determined by
$\langle v, a\rangle > 0$ and is mapped into a by the ML rule. Similarly, the set of points closer to −a is
determined by $\langle v, a\rangle < 0$, and is mapped into −a. In the general MAP case, the region mapped
into a is again separated from the region mapped into −a by a perpendicular to a, but in this
case it is the perpendicular defined by $\langle v, a\rangle = N_0 \ln(\eta)/4$.

Figure 8.4: ML decision regions for binary signals in WGN. A vector v on the threshold
boundary is shown. The distance from v to a is $d = \|v - a\|$. Similarly the distance to −a
is $d' = \|v + a\|$. As shown algebraically in (8.21), any point at which $d^2 - d'^2 = 0$ is a point
at which $\langle v, a\rangle = 0$, and thus at which the LLR is 0. Geometrically, from the Pythagorean
theorem, however, $d^2 - d'^2 = \tilde{d}^2 - \tilde{d}'^2$, where $\tilde{d}$ and $\tilde{d}'$ are the distances from a and −a to
the projection of v on the straight line generated by a. This demonstrates geometrically why
it is only the projection of v onto a that is relevant.


Another way of interpreting (8.22) is to view it in a different co-ordinate system. That is,
choose $\phi_1 = a/\|a\|$ as one element of an orthonormal basis for the n-vectors and choose
another n−1 orthonormal vectors by the Gram-Schmidt procedure. In this new co-ordinate
system v can be expressed as $(v'_1, v'_2, \ldots, v'_n)^T$, where for $1 \le k \le n$, $v'_k = \langle v, \phi_k\rangle$. Since
$\langle v, a\rangle = \|a\| \langle v, \phi_1\rangle = \|a\|\, v'_1$, the left side of (8.22) is simply $v'_1$, i.e., the size of the projection
of v onto a. Thus (8.22) becomes

\[ v'_1 \;\underset{\hat{U}=-a}{\overset{\hat{U}=a}{\gtreqless}}\; \frac{N_0 \ln(\eta)}{4\|a\|}. \]


This is the same as the one-dimensional MAP test in (8.14). In other words, the n-dimensional
problem is the same as the one dimensional problem when the appropriate co-ordinate system
is chosen. Actually, the derivation of (8.22) has shown something more, namely that $v'_1$ is a
sufficient statistic. The components $v'_2, \ldots, v'_n$, which contain only noise, cancel out in (8.21)
if (8.21) is expressed in the new co-ordinate system. The fact that the co-ordinates of v in
directions orthogonal to the signal do not affect the LLR is sometimes called the theorem of
irrelevance. A generalized form of this theorem is stated later as Theorem 8.4.2.


Some additional insight into (8.22) (in the original co-ordinate system) can be gained by writing 
$\langle v, a\rangle$ as $\sum_k v_k a_k$. This says that the MAP test weights each co-ordinate linearly by the amount
of signal in that co-ordinate. This is not surprising, since the two hypotheses are separated more 
by the larger components of a than by the smaller. 
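
The irrelevance of components orthogonal to the signal can be seen in a small numerical sketch. The vectors and the noise level below are hypothetical values chosen for illustration, and the function names are assumptions; the LLR is computed as $4\langle v, a\rangle/N_0$ from (8.21) and compared with $\ln\eta$, $\eta = p_1/p_0$.

```python
import math

def inner(x, y):
    """Real inner product <x, y> = sum_k x_k y_k."""
    return sum(xk * yk for xk, yk in zip(x, y))

def llr(v, a, n0):
    """LLR(v) = 4 <v, a> / N0, as in (8.21)."""
    return 4.0 * inner(v, a) / n0

def map_decide(v, a, n0, p0=0.5):
    """Return +1 for U_hat = a and -1 for U_hat = -a (threshold ln(p1/p0))."""
    eta = (1.0 - p0) / p0
    return 1 if llr(v, a, n0) >= math.log(eta) else -1

# Hypothetical numbers: signal a, observation v, and a vector orthogonal to a.
a = [3.0, 4.0]
v = [1.0, 2.0]
orthogonal = [4.0, -3.0]   # <orthogonal, a> = 0
v_shifted = [vk + 0.5 * ok for vk, ok in zip(v, orthogonal)]

llr_v = llr(v, a, n0=2.0)                # 4 * 11 / 2 = 22
llr_shifted = llr(v_shifted, a, n0=2.0)  # unchanged by the orthogonal component
```

Moving the observation along a direction orthogonal to a leaves the LLR, and hence every threshold test on it, unchanged.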


Next consider the error probability conditional on U = −a. Given U = −a, V = −a + Z, and
thus

\[ \frac{\langle V, a\rangle}{\|a\|} = -\|a\| + \langle Z, \phi_1\rangle. \]

The mean and variance of this, given U = −a, are $-\|a\|$ and $N_0/2$. Thus, $\langle V, a\rangle/\|a\|$ is
$N(-\|a\|, N_0/2)$. From (8.22), the probability of error, given U = −a, is the probability that
$N(-\|a\|, N_0/2)$ exceeds $N_0 \ln(\eta)/(4\|a\|)$. This is the probability that Z is greater than $\|a\| +
N_0 \ln(\eta)/(4\|a\|)$. Normalizing as in subsection 8.3.1,

\[ \Pr\{e \mid U=-a\} = Q\left(\sqrt{\frac{2\|a\|^2}{N_0}} + \frac{\ln\eta}{2\sqrt{2\|a\|^2/N_0}}\right). \tag{8.23} \]

By the same argument,

\[ \Pr\{e \mid U=a\} = Q\left(\sqrt{\frac{2\|a\|^2}{N_0}} - \frac{\ln\eta}{2\sqrt{2\|a\|^2/N_0}}\right). \tag{8.24} \]


It can be seen that this is the same answer as given by (8.15) and (8.16) when the problem is 
converted to a coordinate system where a is collinear with a coordinate vector. The energy per 
bit is $E_b = \|a\|^2$, so that (8.17) and (8.18) follow as before. This is not surprising, of course,
since this vector decision problem is identical to the scalar problem when the appropriate basis 
is used. 


For most communication problems, the a priori probabilities are assumed to be equal so that
$\eta = 1$. Thus, as in (8.19),

\[ \Pr\{e\} = Q\left(\sqrt{\frac{2E_b}{N_0}}\right). \tag{8.25} \]


This gives us a useful sanity check - the probability of error does not depend on the orthonormal 
coordinate basis. 


Now suppose that the binary hypotheses correspond to non-antipodal vector signals, say $b_0$ and
$b_1$. We analyze this in the same way as the scalar case. Namely, let $c = (b_0 + b_1)/2$ and
$a = b_0 - c$. Then the two signals are $b_0 = a + c$ and $b_1 = -a + c$. As before, converting
the observation V to $\tilde{V} = V - c$ shifts the midpoint and converts the problem back to the
antipodal case. The error probability depends only on the distance $2\|a\|$ between the signals and
is given by (8.23) and (8.24). The energy per bit is again different, and assuming equiprobable
input vectors, the energy per bit is $E_b = \|a\|^2 + \|c\|^2$. Thus the center point c contributes to
the energy, but not to the error probability.


It is often more convenient, especially when generalizing to M > 2 hypotheses, to express the
LLR for the non-antipodal case directly in terms of $b_0$ and $b_1$. Using (8.21) for the shifted
vector $\tilde{V}$, the LLR can be expressed as

\[ \mathrm{LLR}(v) = \frac{-\|v - b_0\|^2 + \|v - b_1\|^2}{N_0}. \tag{8.26} \]

For ML detection, this is simply the minimum distance rule, and for MAP, the interpretation is
the same as for the antipodal case.
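
A minimal sketch of the rule based on (8.26) follows. The signal points, observation, and function names are hypothetical; the decision compares the LLR with $\ln\eta$, $\eta = p_1/p_0$, so that equal priors give the minimum distance rule.

```python
import math

def squared_distance(x, y):
    """Squared Euclidean distance between two real vectors."""
    return sum((xk - yk) ** 2 for xk, yk in zip(x, y))

def llr_non_antipodal(v, b0, b1, n0):
    """LLR(v) = (-||v - b0||^2 + ||v - b1||^2) / N0, as in (8.26)."""
    return (-squared_distance(v, b0) + squared_distance(v, b1)) / n0

def decide(v, b0, b1, n0, p0=0.5):
    """Return 0 (choose b0) if LLR(v) >= ln(p1/p0), else 1 (choose b1)."""
    eta = (1.0 - p0) / p0
    return 0 if llr_non_antipodal(v, b0, b1, n0) >= math.log(eta) else 1

# Hypothetical example in two dimensions.
b0, b1 = [2.0, 0.0], [0.0, 0.0]
v = [1.2, 5.0]

llr_value = llr_non_antipodal(v, b0, b1, n0=1.0)
ml_choice = decide(v, b0, b1, n0=1.0)            # v is closer to b0
map_choice = decide(v, b0, b1, n0=1.0, p0=0.2)   # prior favoring b1 moves the boundary
```

The large second coordinate of v, which is orthogonal to $b_0 - b_1$, cancels in the LLR, as in the antipodal case.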


8.3.4 Detection for binary complex vectors in WGN


Next consider the complex vector version of the same problem. Assume the observation is a
complex random n-vector V = U + Z. The noise, $Z = (Z_1, \ldots, Z_n)^T$, is a complex random
vector of n zero-mean complex iid Gaussian rv’s with iid real and imaginary parts, each
$N(0, N_0/2)$. Thus each $Z_k$ is circularly symmetric and denoted by $CN(0, N_0)$. The input U is
independent of Z and binary, taking on value a with probability $p_0$ and −a with probability $p_1$,
where $a = (a_1, \ldots, a_n)^T$ is an arbitrary complex n-vector.


This problem can be reduced to that of the last subsection by letting $Z'$ be the 2n dimensional
real random vector with components $\Re(Z_k)$ and $\Im(Z_k)$ for $1 \le k \le n$. Similarly let $a'$ be the 2n
dimensional real vector with components $\Re(a_k)$ and $\Im(a_k)$ for $1 \le k \le n$ and let $U'$ be the real
random vector that takes on values $a'$ or $-a'$. Finally, let $V' = U' + Z'$.


Recalling that probability densities for complex random variables or vectors are equal to the 
joint probability densities for the real and imaginary parts, 


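\[ f_{V|U}(v \mid a) = f_{V'|U'}(v' \mid a') = \frac{1}{(\pi N_0)^{n}} \exp \sum_{k=1}^{n} \frac{-\Re(v_k - a_k)^2 - \Im(v_k - a_k)^2}{N_0} \]

\[ f_{V|U}(v \mid -a) = f_{V'|U'}(v' \mid -a') = \frac{1}{(\pi N_0)^{n}} \exp \sum_{k=1}^{n} \frac{-\Re(v_k + a_k)^2 - \Im(v_k + a_k)^2}{N_0}. \]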

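The LLR is then

\[ \mathrm{LLR}(v) = \frac{-\|v - a\|^2 + \|v + a\|^2}{N_0}. \tag{8.27} \]

Note that

\[ \|v - a\|^2 = \|v\|^2 - \langle v, a\rangle - \langle a, v\rangle + \|a\|^2 = \|v\|^2 - 2\Re(\langle v, a\rangle) + \|a\|^2. \]

Using this and the analogous expression for $\|v + a\|^2$, (8.27) becomes

\[ \mathrm{LLR}(v) = \frac{4\Re(\langle v, a\rangle)}{N_0}. \tag{8.28} \]

The MAP test can now be stated as

\[ \frac{\Re(\langle v, a\rangle)}{\|a\|} \;\underset{\hat{U}=-a}{\overset{\hat{U}=a}{\gtreqless}}\; \frac{N_0 \ln(\eta)}{4\|a\|}. \tag{8.29} \]
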
Note that the value of the LLR and the form of the MAP test are the same as the real vector case
except for the real part of $\langle v, a\rangle$. The significance of this real part operation is now discussed.


In the n-dimensional complex vector space, $\langle v, a\rangle/\|a\|$ is the complex value of the projection of
v in the direction of a. In order to understand this projection better, consider an orthonormal
basis in which $a = (1, 0, 0, \ldots, 0)^T$. Then $\langle v, a\rangle/\|a\| = v_1$. Thus $\Re(v_1) = \pm 1 + \Re(z_1)$ and
$\Im(v_1) = \Im(z_1)$. Clearly, only $\Re(v_1)$ is relevant to the binary decision. Using $\Re[\langle v, a\rangle/\|a\|]$ in
(8.29) is simply the general way of stating this elementary idea. If the complex plane is viewed
as a 2-dimensional real space, then taking the real part of $\langle v, a\rangle$ is equivalent to taking the
further projection of this two dimensional real vector in the direction of a (see Exercise 8.12).
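
The reduction from n complex dimensions to 2n real dimensions can be checked numerically. In the sketch below, the vectors and function names are hypothetical, and the inner product is taken as $\langle v, a\rangle = \sum_k v_k a_k^*$; the complex LLR of (8.28) is compared with the real-vector LLR of (8.21) applied to the stacked real and imaginary parts.

```python
def complex_inner(v, a):
    """Complex inner product <v, a> = sum_k v_k * conj(a_k)."""
    return sum(vk * ak.conjugate() for vk, ak in zip(v, a))

def llr_complex(v, a, n0):
    """LLR(v) = 4 Re<v, a> / N0, as in (8.28)."""
    return 4.0 * complex_inner(v, a).real / n0

def to_real(x):
    """Stack real and imaginary parts into a 2n-dimensional real vector."""
    return [xk.real for xk in x] + [xk.imag for xk in x]

def llr_real(v, a, n0):
    """LLR(v) = 4 <v, a> / N0 for real vectors, as in (8.21)."""
    return 4.0 * sum(vk * ak for vk, ak in zip(v, a)) / n0

# Hypothetical complex signal and observation (n = 2).
a = [1 + 2j, -1j]
v = [0.5 - 1j, 2 + 0.5j]

llr_c = llr_complex(v, a, n0=2.0)
llr_r = llr_real(to_real(v), to_real(a), n0=2.0)
```

The two values agree because $\Re(v_k a_k^*) = \Re(v_k)\Re(a_k) + \Im(v_k)\Im(a_k)$, which is exactly the real inner product of the stacked vectors.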

The other results and interpretations of the last subsection remain unchanged. In particular,
since $\|a'\| = \|a\|$, the error probability results are given by

\[ \Pr\{e \mid U=-a\} = Q\left(\sqrt{\frac{2\|a\|^2}{N_0}} + \frac{\ln\eta}{2\sqrt{2\|a\|^2/N_0}}\right) \tag{8.30} \]

\[ \Pr\{e \mid U=a\} = Q\left(\sqrt{\frac{2\|a\|^2}{N_0}} - \frac{\ln\eta}{2\sqrt{2\|a\|^2/N_0}}\right). \tag{8.31} \]

For the ML case, recognizing that $\|a\|^2 = E_b$, we have the familiar result

\[ \Pr\{e\} = Q\left(\sqrt{\frac{2E_b}{N_0}}\right). \tag{8.32} \]


Finally, for the non-antipodal case with hypotheses $b_0$ and $b_1$, the LLR is again given by (8.26).


8.3.5 Detection of binary antipodal waveforms in WGN 


This section extends the vector case of the previous two subsections to the waveform case. 
It will be instructive to do this simultaneously for both passband real random processes and 
baseband complex random processes. Let U(t) be the baseband modulated waveform. As 
before, the situation is simplified by transmitting a single bit rather than a sequence of bits, 
so for some arbitrary, perhaps complex, baseband waveform a(t), the binary input 0 is mapped 
into U(t) = a(t) and 1 is mapped into U(t) = −a(t); the a priori probabilities are denoted by $p_0$
and $p_1$. Let $\{\theta_k(t); k \in \mathbb{Z}\}$ be a complex orthonormal expansion covering the baseband region
of interest, and let $a(t) = \sum_k a_k \theta_k(t)$.


Assume U(t) = ±a(t) is modulated onto a carrier $f_c$ larger than the baseband bandwidth. The
resulting bandpass waveform is denoted X(t) = ±b(t) where, from Section 7.8, the modulated
form of a(t), denoted b(t), can be represented as

\[ b(t) = \sum_k b_{k,1}\psi_{k,1}(t) + b_{k,2}\psi_{k,2}(t) \]

where

\[ b_{k,1} = \Re(a_k); \qquad \psi_{k,1}(t) = \Re\{2\theta_k(t) \exp[2\pi i f_c t]\}; \]

\[ b_{k,2} = \Im(a_k); \qquad \psi_{k,2}(t) = -\Im\{2\theta_k(t) \exp[2\pi i f_c t]\}. \]

From Theorem 6.6.1, the set of waveforms $\{\psi_{k,j}(t); k \in \mathbb{Z}, j \in \{1,2\}\}$ are orthogonal, each with
energy 2. Let $\{\phi_m(t); m \in \mathbb{Z}\}$ be a set of orthogonal functions, each of energy 2 and each
orthogonal to each of the $\psi_{k,j}(t)$. Assume that $\{\phi_m(t); m \in \mathbb{Z}\}$, together with the $\psi_{k,j}(t)$, span
$L_2$.


The noise W(t), by assumption, is WGN. It can be represented as

\[ W(t) = \sum_k \left(Z_{k,1}\psi_{k,1}(t) + Z_{k,2}\psi_{k,2}(t)\right) + \sum_m W_m \phi_m(t), \]

where $\{Z_{k,j}; k \in \mathbb{Z}, j \in \{1,2\}\}$ is the set of scaled linear functionals of the noise in the $L_2$ vector
space spanned by the $\psi_{k,j}(t)$, and $\{W_m; m \in \mathbb{Z}\}$ is the set of linear functionals of the noise in
the orthogonal complement of the space. As will be seen shortly, the joint distribution of the
$W_m$ makes no difference in choosing between a(t) and −a(t), so long as the $W_m$ are independent
of the $Z_{k,j}$ and the transmitted binary digit. The observed random process at passband is then
Y(t) = X(t) + W(t),

\[ Y(t) = \sum_k \left[Y_{k,1}\psi_{k,1}(t) + Y_{k,2}\psi_{k,2}(t)\right] + \sum_m W_m \phi_m(t) \quad \text{where} \]

\[ Y_{k,1} = \pm b_{k,1} + Z_{k,1}; \qquad Y_{k,2} = \pm b_{k,2} + Z_{k,2}. \]


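First assume that a finite number n of orthonormal functions are used to represent a(t). This
is no loss of generality, since the single function $a(t)/\|a(t)\|$ would be sufficient. Suppose also,
initially, that only a finite set, say $W_1, \ldots, W_\ell$, of the orthogonal noise functionals are observed.
Assume also that the noise variables, $Z_{k,j}$ and $W_m$, are independent and each $N(0, N_0/2)$.^4 Then
the likelihoods are given by

\[ f_{Y|X}(y \mid b) = \frac{1}{(\pi N_0)^{n}} \exp\left[\sum_{k=1}^{n}\sum_{j=1}^{2} \frac{-(y_{k,j} - b_{k,j})^2}{N_0} + \sum_{m=1}^{\ell} \frac{-w_m^2}{N_0}\right] \]

\[ f_{Y|X}(y \mid -b) = \frac{1}{(\pi N_0)^{n}} \exp\left[\sum_{k=1}^{n}\sum_{j=1}^{2} \frac{-(y_{k,j} + b_{k,j})^2}{N_0} + \sum_{m=1}^{\ell} \frac{-w_m^2}{N_0}\right]. \]

The log likelihood ratio is thus given by

\[ \mathrm{LLR}(y) = \sum_{k=1}^{n}\sum_{j=1}^{2} \frac{-(y_{k,j} - b_{k,j})^2 + (y_{k,j} + b_{k,j})^2}{N_0} \]

\[ = \frac{-\|y - b\|^2 + \|y + b\|^2}{N_0} \tag{8.33} \]

\[ = \sum_{k=1}^{n}\sum_{j=1}^{2} \frac{4 y_{k,j} b_{k,j}}{N_0} = \frac{4\langle y, b\rangle}{N_0} \tag{8.34} \]

and the MAP test is

\[ \frac{\langle y, b\rangle}{\|b\|} \;\underset{\hat{X}=-b}{\overset{\hat{X}=b}{\gtreqless}}\; \frac{N_0 \ln(\eta)}{4\|b\|}. \]
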
This is the same as the real vector case analyzed in Subsection 8.3.3. In fact, the only difference 
is that the observation here includes noise in the degrees of freedom orthogonal to the range of 
interest, and the derivation of the LLR shows clearly why these noise variables do not appear 
in the LLR. In fact, the number $\ell$ of rv’s $W_m$ can be taken to be arbitrarily large, and they can
have any joint density. So long as they are independent of the $Z_{k,j}$ (and of X(t)), they cancel
out in the LLR. In other words, WGN is noise that is iid Gaussian over a large enough space to 
represent the signal, and is independent of the signal and noise elsewhere. 


^4 Recall that $N_0/2$ is the noise variance using the same scale as used for the signal waveform. Since the input
energy is measured at baseband, the noise is also. At passband, the signal energy is scaled up by a factor of 2,
and the noise energy is similarly scaled.

The argument above leading to (8.33) and (8.34) is not entirely satisfying mathematically, since
it is based on the slightly vague notion of the signal space of interest, but in fact it is just this 
feature that makes it useful in practice, since physical noise characteristics do change over large 
changes in time and frequency. 


The inner product in (8.34) is the inner product over the $\ell^2$ space of real sequences. Since these
sequences are coefficients in an orthogonal (rather than orthonormal) expansion, the conversion
to an inner product over the corresponding functions (see Exercise 8.5) is given by

\[ \sum_{k,j} y_{k,j} b_{k,j} = \frac{1}{2} \int y(t)\, b(t)\, dt. \tag{8.35} \]
kj 


This shows that the LLR is independent of the basis, and that this waveform problem reduces 
to the single dimensional problem if b(t) is a multiple of one of the basis functions. Also, if a 
countably infinite basis for the signal space of interest is used, (8.35) is still valid. 
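
The factor 1/2 in (8.35) comes from the basis functions having energy 2, and can be verified numerically. The sketch below makes several assumptions for illustration: a single baseband function $\theta(t) = 1$ on [0, 1), an integer carrier frequency `FC = 5`, hypothetical coefficient values, and a simple midpoint Riemann sum for the integrals.

```python
import math

FC = 5.0  # hypothetical carrier frequency, in cycles per unit interval

def psi1(t):
    """psi_1(t) = Re{2 theta(t) exp(2 pi i f_c t)} with theta(t) = 1 on [0, 1)."""
    return 2.0 * math.cos(2.0 * math.pi * FC * t)

def psi2(t):
    """psi_2(t) = -Im{2 theta(t) exp(2 pi i f_c t)} with theta(t) = 1 on [0, 1)."""
    return -2.0 * math.sin(2.0 * math.pi * FC * t)

def integrate(f, steps=20000):
    """Midpoint Riemann sum of f over [0, 1)."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

def waveform(coeffs):
    """Waveform with the given coefficients in the orthogonal expansion."""
    return lambda t: coeffs[0] * psi1(t) + coeffs[1] * psi2(t)

y_coeffs = (0.7, -1.2)   # hypothetical (y_1, y_2)
b_coeffs = (1.5, 0.4)    # hypothetical (b_1, b_2)
y, b = waveform(y_coeffs), waveform(b_coeffs)

energy = integrate(lambda t: psi1(t) ** 2)                       # each basis function has energy 2
sequence_inner = y_coeffs[0] * b_coeffs[0] + y_coeffs[1] * b_coeffs[1]
waveform_inner = integrate(lambda t: y(t) * b(t))                # twice the sequence inner product
```

Here the coefficient inner product equals half the waveform inner product, as (8.35) states.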


Next consider what happens when Y(t) = ±b(t) + W(t) is demodulated to the baseband waveform
V(t). The component $\sum_m W_m \phi_m(t)$ of Y(t) extends to frequencies outside the passband, and thus
Y(t) is filtered before demodulation, preventing an aliasing-like effect between $\sum_m W_m \phi_m(t)$ and
the signal part of Y(t) (see Exercise 6.11). Assuming that this filtering does not affect b(t), b(t)
maps back into $a(t) = \sum_k a_k \theta_k(t)$ where $a_k = b_{k,1} + i b_{k,2}$. Similarly W(t) maps into

\[ Z(t) = \sum_k Z_k \theta_k(t) + Z_\perp(t) \]

where $Z_k = Z_{k,1} + i Z_{k,2}$ and $Z_\perp(t)$ is the result of filtering and frequency demodulation on
$\sum_m W_m \phi_m(t)$. The received baseband complex process is then

\[ V(t) = \sum_k V_k \theta_k(t) + Z_\perp(t) \quad \text{where} \quad V_k = \pm a_k + Z_k. \tag{8.36} \]
k 


By the filtering assumption above, the sample functions of $Z_\perp(t)$ are orthogonal to the space
spanned by the $\theta_k(t)$ and thus the sequence $\{V_k; k \in \mathbb{Z}\}$ is determined from V(t). Since $V_k =
Y_{k,1} + i Y_{k,2}$, the sample value LLR(y) in (8.34) is determined as follows by the sample values of
$\{v_k; k \in \mathbb{Z}\}$,

\[ \mathrm{LLR}(y) = \frac{4\langle y, b\rangle}{N_0} = \frac{4\Re(\langle v, a\rangle)}{N_0}. \tag{8.37} \]

Thus $\{v_k; k \in \mathbb{Z}\}$ is a sufficient statistic for y(t), and thus the MAP test based on y(t) can be
done using v(t). Now an implementation that first finds the sample function v(t) from y(t) and
then does a MAP test on v(t) is simply a particular kind of test on y(t), and thus cannot achieve
a smaller error probability than the MAP test on y. Finally, since $\{v_k; k \in \mathbb{Z}\}$ is a sufficient
statistic for y(t), it is also a sufficient statistic for v(t) and thus the orthogonal noise $Z_\perp(t)$ is
irrelevant.


Note that the LLR in (8.37) is the same as the complex vector result in (8.28). One could repeat
the argument there, adding in an orthogonal expansion for $Z_\perp(t)$ to verify the argument that
$Z_\perp(t)$ is irrelevant. Since $Z_\perp(t)$ could take on virtually any form, the argument above, based on
the fact that $Z_\perp(t)$ is a function of $\sum_m W_m \phi_m(t)$, which is independent of the signal and noise
in the signal space, is more insightful.

To summarize this subsection, the detection of a single bit sent by generating antipodal signals
at baseband and modulating to passband has been analyzed. After adding WGN, the received
waveform is demodulated to baseband and then the single bit is detected. The MAP detector at
passband is a threshold test on $\int y(t)\, b(t)\, dt$. This is equivalent to a threshold test at baseband
on $\Re[\int v(t)\, a^*(t)\, dt]$. This shows that no loss of optimality occurs by demodulating to baseband
and also shows that detection can be done either at passband or at baseband. In the passband
case, the result is an immediate extension of binary detection for real vectors, and at baseband,
it is an immediate extension of binary detection of complex vectors.


The results of this section can now be interpreted in terms of PAM and QAM, while still assuming
a “one-shot” system in which only one binary digit is actually sent. Recall that for both PAM
and QAM modulation, the modulation pulse p(t) is orthogonal to its T-spaced time shifts if
$|\hat{p}(f)|^2$ satisfies the Nyquist criterion. Thus, if the corresponding received baseband waveform
is passed through a matched filter (a filter with impulse response $p^*(-t)$) and sampled at times
kT, the received samples will have no intersymbol interference. For a single bit transmitted at
discrete time 0, $u(t) = \pm a(t) = \pm a\, p(t)$. The output of the matched filter at receiver time 0 is
then

\[ \int v(t)\, p^*(t)\, dt = \frac{\langle v, a\rangle}{a}, \]

where $\langle v, a\rangle$ is the inner product of the waveforms v(t) and a(t). This output
is a scaled version of the LLR. Thus the receiver from Chapter 6 that avoids intersymbol
interference also calculates the LLR, from which a threshold test yields the MAP detection.


The next section shows that this continues to provide MAP tests on successive signals. It 
should be noted also that sampling the output of the matched filter at time 0 yields the MAP 
test whether or not p(t) has been chosen to avoid intersymbol interference. 


It is important to note that the performance of binary antipodal communication in WGN depends
only on the energy of the transmitted waveform. With ML detection, the error probability
is the familiar expression $Q(\sqrt{2E_b/N_0})$ where $E_b = \int |a(t)|^2\, dt$ and the variance of the noise in each
real degree of freedom in the region of interest is $N_0/2$.


This completes the analysis of binary detection in WGN, including the relationship between the 
vector case and waveform case and that between complex waveforms or vectors at baseband
and real waveforms or vectors at passband. 


The following sections analyze M-ary detection. The relationships between vector and waveform
and between real and complex are the same as above, so the following sections each assume
whichever of these cases is most instructive without further discussion of these relationships.


8.4 M-ary detection and sequence detection 


The analysis in the previous section was limited in several ways. First, only binary signal 
sets were considered, and second, only the ‘one-shot’ problem where a single bit rather than 
a sequence of bits was considered. In this section, M-ary signal sets for arbitrary M will be 
considered, and this will then be used to study the transmission of a sequence of signals and to 
study arbitrary modulation schemes. 


8.4.1 M-ary detection


Going from binary to M-ary hypothesis testing is a simple extension. To be specific, this will
be analyzed for the complex random vector case. Let the observation be a complex random
n-vector V and let the complex random n-vector U to be detected take on a value from the set
$\{a_0, \ldots, a_{M-1}\}$ with a priori probabilities $p_0, \ldots, p_{M-1}$. Denote the a posteriori probabilities
by $p_{U|V}(a_m \mid v)$. The MAP rule (see Section 8.1) then chooses $\hat{U}(v) = \arg\max_m p_{U|V}(a_m \mid v)$.
Assuming that the likelihoods can be represented as probability densities $f_{V|U}$, the MAP rule
can be expressed as

\[ \hat{U}(v) = \arg\max_m\; p_m f_{V|U}(v \mid a_m). \]


Usually, the simplest approach to this M-ary rule is to consider multiple binary hypothesis
testing problems. That is, $\hat{U}(v)$ is that $a_m$ for which

\[ \Lambda_{m,m'}(v) = \frac{f_{V|U}(v \mid a_m)}{f_{V|U}(v \mid a_{m'})} \ge \frac{p_{m'}}{p_m} \]

for all $m'$. In the case of ties, it makes no difference which of the maximizing hypotheses is
chosen.


For the complex vector additive WGN case, the observation is V = U + Z where Z is complex
Gaussian noise with iid real and imaginary components. As derived in (8.27), the log likelihood
ratio (LLR) between each pair of hypotheses $a_m$ and $a_{m'}$ is given by

\[ \mathrm{LLR}_{m,m'}(v) = \frac{-\|v - a_m\|^2 + \|v - a_{m'}\|^2}{N_0}. \tag{8.38} \]


Thus each binary test separates the observation space^5 into two regions separated by the per-
pendicular bisector between the two points. With M hypotheses, the space is separated into 
the Voronoi regions of points closest to each of the signals (hypotheses) (see Figure 8.5). If 
the a priori probabilities are unequal, then these perpendicular bisectors are shifted, remaining 
perpendicular to the axis joining the two signals, but no longer being bisectors. 
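
For WGN, maximizing $p_m f_{V|U}(v \mid a_m)$ is equivalent to maximizing $\ln p_m - \|v - a_m\|^2/N_0$, which is consistent with the pairwise LLR in (8.38). The sketch below illustrates this; the four-point constellation, the priors, and the function names are hypothetical choices made for the example.

```python
import math

def squared_distance(v, a):
    """Squared Euclidean distance between two (real or complex) vectors."""
    return sum(abs(vk - ak) ** 2 for vk, ak in zip(v, a))

def map_detect(v, signals, priors, n0):
    """Index m maximizing ln(p_m) - ||v - a_m||^2 / N0."""
    def metric(m):
        return math.log(priors[m]) - squared_distance(v, signals[m]) / n0
    return max(range(len(signals)), key=metric)

def ml_detect(v, signals):
    """Index of the signal closest to v (the Voronoi region containing v)."""
    return min(range(len(signals)), key=lambda m: squared_distance(v, signals[m]))

# Hypothetical 4-point constellation in one complex dimension.
signals = [[1 + 1j], [-1 + 1j], [-1 - 1j], [1 - 1j]]
v = [0.1 + 0.9j]

ml_index = ml_detect(v, signals)
map_equal = map_detect(v, signals, [0.25] * 4, n0=2.0)
map_skewed = map_detect(v, signals, [0.01, 0.97, 0.01, 0.01], n0=2.0)
```

With equal priors the MAP and ML decisions coincide; a heavily skewed prior shifts the boundary between the first two signals enough to change the decision for this observation.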


Figure 8.5: Decision regions for an M-ary alphabet of vector signals in iid Gaussian noise. For
ML detection, the decision regions are Voronoi regions, i.e., regions separated by perpendicular
bisectors between the signal points.


^5 For an n dimensional complex vector space, it is simplest to view the observation space as the corresponding
2n dimensional real vector space.
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The probability that noise carries the observation across one of these perpendicular bisectors is 
given in (8.29). The only new problem that arises with M-ary hypothesis testing is that the error 
probability, given U = m, is the union of M —1 events, namely crossing the corresponding per- 
pendicular to each other point. This can be found exactly by integrating over the n dimensional 
vector space, but is usually upper bounded and approximated by the union bound, where the 
probability of crossing each perpendicular is summed over the M —1 incorrect hypotheses. This 
is usually a good approximation (if MW is not too large), because the Gaussian density decreases 
so rapidly with distance; thus in the ML case, most errors are made when observations occur 
roughly half way between the transmitted and the detected signal point. 
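
The union bound described above can be sketched numerically. The following is a minimal illustration, not a definitive implementation: it assumes real signal vectors, noise variance $N_0/2$ per real dimension, and a pairwise crossing probability of $Q(d/\sqrt{2N_0})$ for two points at distance d. The names `q_func` and `union_bound`, the 4-point constellation, and the noise level are all hypothetical choices made for the example.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x) = Pr{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def union_bound(points, m, n0):
    """Union bound on Pr(error | signal m sent) for ML detection.

    points: signal points as tuples of real coordinates.
    n0: noise level N0; each real noise coordinate has variance n0 / 2.
    """
    total = 0.0
    for j, other in enumerate(points):
        if j == m:
            continue
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(points[m], other)))
        # Probability of crossing the perpendicular bisector toward point j.
        total += q_func(d / math.sqrt(2.0 * n0))
    return total

# Hypothetical 4-point QAM constellation and noise level, for illustration only.
qam4 = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
bound = union_bound(qam4, 0, 0.5)
# For this square constellation the exact value is 1 - (1 - Q(2))**2,
# since an error occurs when either coordinate changes sign.
exact = 1.0 - (1.0 - q_func(2.0)) ** 2
print(bound, exact)
```

For this example the bound exceeds the exact value only slightly, because the diagonal neighbor contributes little and the two nearest-neighbor events rarely occur together.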


8.4.2 Successive transmissions of QAM signals in WGN 


This subsection extends the ‘single-shot’ analysis of detection for QAM and PAM in the presence 
of WGN to the case in which an n-tuple of successive independent symbols are transmitted. We 
shall find that under many conditions, both the detection rule and the corresponding probability 
of symbol error can be analyzed by looking at one symbol at a time. 


First consider a QAM modulation system using a modulation pulse $p(t)$. Assume that $p(t)$ has
unit energy and is orthonormal to its T-spaced shifts $\{p(t-kT);\, k \in \mathbb{Z}\}$, i.e., that
$\{p(t-kT);\, k \in \mathbb{Z}\}$ is a set of orthonormal functions. Let $\mathcal{A} = \{a_0, \ldots, a_{M-1}\}$ be the alphabet
of complex input signals and denote the input waveform over an arbitrary n-tuple of successive
input signals as

$$u(t) = \sum_{k=1}^{n} u_k\, p(t - kT),$$

where each $u_k$ is a selection from the input alphabet $\mathcal{A}$.


Let $\{\phi_k(t);\, k \ge 1\}$ be an orthonormal basis of complex $\mathcal{L}_2$ waveforms such that the first n
waveforms in that basis are given by $\phi_k(t) = p(t - kT)$, $1 \le k \le n$. The received baseband
waveform is then

$$V(t) = \sum_{k=1}^{\infty} V_k\, \phi_k(t) = \sum_{k=1}^{n} (u_k + Z_k)\, p(t - kT) + \sum_{k>n} Z_k\, \phi_k(t). \tag{8.39}$$


We now compare two different detection schemes. In the first, a single ML decision between the
$M^n$ hypotheses for all possible joint values of $U_1, \ldots, U_n$ is made based on $V(t)$. In the second
scheme, for each k, $1 \le k \le n$, an ML decision between the M possible hypotheses $a_0, \ldots, a_{M-1}$
is made for input $U_k$, based on the observation $V(t)$. Thus in this scheme, n separate M-ary
decisions are made, one for each of the n successive inputs.


For the first alternative, each hypothesis corresponds to an n dimensional vector of inputs,
$\mathbf{u} = (u_1, \ldots, u_n)^{\mathsf T}$. As in Subsection 8.3.5, the sample value $v(t) = \sum_k v_k\, \phi_k(t)$ of the received
waveform can be taken as an ℓ-tuple $\mathbf{v} = (v_1, v_2, \ldots, v_\ell)^{\mathsf T}$ with $\ell \ge n$. The likelihood of $\mathbf{v}$
conditional on $\mathbf{u}$ is then given by

$$f_{\mathbf{V}|\mathbf{U}}(\mathbf{v}\,|\,\mathbf{u}) = \prod_{k=1}^{n} f_Z(v_k - u_k) \prod_{k=n+1}^{\ell} f_Z(v_k).$$


For any two hypotheses, say $\mathbf{u} = (u_1, \ldots, u_n)^{\mathsf T}$ and $\mathbf{u}' = (u'_1, \ldots, u'_n)^{\mathsf T}$, the likelihood ratio and
LLR are

$$\Lambda_{\mathbf{u},\mathbf{u}'}(\mathbf{v}) = \prod_{k=1}^{n} \frac{f_Z(v_k - u_k)}{f_Z(v_k - u'_k)} \tag{8.40}$$

$$\mathrm{LLR}_{\mathbf{u},\mathbf{u}'}(\mathbf{v}) = \frac{-\|\mathbf{v} - \mathbf{u}\|^2 + \|\mathbf{v} - \mathbf{u}'\|^2}{N_0}. \tag{8.41}$$

Note that for each $k > n$, $v_k$ does not appear in this likelihood ratio. Thus this likelihood ratio
is still valid⁶ in the limit $\ell \to \infty$, but the only relevant terms in the decision are $v_1, \ldots, v_n$.
Therefore let $\mathbf{v} = (v_1, \ldots, v_n)^{\mathsf T}$ in what follows. From (8.41), the LLR is positive if
and only if $\|\mathbf{v} - \mathbf{u}\| < \|\mathbf{v} - \mathbf{u}'\|$. The conclusion is that for $M^n$-ary detection, done jointly on
$u_1, \ldots, u_n$, the ML decision is the vector $\mathbf{u}$ that minimizes the distance $\|\mathbf{v} - \mathbf{u}\|$.


Consider how to minimize $\|\mathbf{v} - \mathbf{u}\|$. Note that

$$\|\mathbf{v} - \mathbf{u}\|^2 = \sum_{k=1}^{n} |v_k - u_k|^2. \tag{8.42}$$

Suppose that $\hat{\mathbf{u}} = (\hat{u}_1, \ldots, \hat{u}_n)^{\mathsf T}$ minimizes this sum. Then for each k, $\hat{u}_k$ minimizes $|v_k - u_k|^2$
over the M choices for $u_k$ (otherwise some $a_m \ne \hat{u}_k$ could be substituted for $\hat{u}_k$ to reduce
$|v_k - u_k|^2$ and therefore reduce the sum in (8.42)). Thus the ML sequence detector with $M^n$
hypotheses detects each $U_k$ by minimizing $|v_k - u_k|^2$ over the M hypotheses for that $U_k$.


Next consider the second alternative above. For a given sample observation $\mathbf{v} = (v_1, \ldots, v_\ell)^{\mathsf T}$ and
a given k, $1 \le k \le n$, the likelihood of $\mathbf{v}$ conditional on $U_k = u_k$ is

$$f_{\mathbf{V}|U_k}(\mathbf{v}\,|\,u_k) = f_Z(v_k - u_k) \prod_{j \ne k,\; 1 \le j \le n} f_{V_j}(v_j) \prod_{j=n+1}^{\ell} f_Z(v_j),$$

where $f_{V_j}(v_j) = \sum_m p_m\, f_{V_j|U_j}(v_j\,|\,a_m)$ is the marginal probability density of $V_j$. The likelihood ratio of
$\mathbf{v}$ between the hypotheses $U_k = a_m$ and $U_k = a_{m'}$ is then

$$\Lambda^{(k)}_{m,m'}(\mathbf{v}) = \frac{f_Z(v_k - a_m)}{f_Z(v_k - a_{m'})}.$$

This is the familiar one-dimensional non-antipodal Gaussian detection problem, and the ML
decision is to choose $\hat{u}_k$ as the $a_m$ closest to $v_k$. Thus, given the sample observation $v(t)$, the
vector $(\hat{u}_1, \ldots, \hat{u}_n)^{\mathsf T}$ of individual M-ary ML detectors for each $U_k$ is the same as the $M^n$-ary
ML sequence detector for the sequence $\mathbf{U} = (U_1, \ldots, U_n)^{\mathsf T}$. Moreover, each of these detectors
is equivalent to a vector of ML decisions on each $U_k$ based solely on the observation $V_k$.


Summarizing, we have proved the following theorem: 


Theorem 8.4.1. Let $U(t) = \sum_{k=1}^{n} U_k\, p(t - kT)$ be a QAM (or PAM) baseband input to a WGN
channel and assume that $\{p(t - kT);\, 1 \le k \le n\}$ is an orthonormal sequence. Then the $M^n$-ary
ML decision on $\mathbf{U} = (U_1, \ldots, U_n)^{\mathsf T}$ is equivalent to making separate M-ary ML decisions on each
$U_k$, $1 \le k \le n$, where the decision on each $U_k$ can be based either on the observation $v(t)$ or the
observation of $v_k$.
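
The equivalence stated in Theorem 8.4.1 can be checked by brute force on a small example. The sketch below is only an illustration under assumed parameters: the 4-point alphabet, the block length of 5, the noise standard deviation, and the function names `ml_symbol` and `ml_sequence` are hypothetical. It searches all $M^n$ sequences for the one closest to the received vector and compares the result with nearest-point decisions made one symbol at a time.

```python
import itertools
import random

def ml_symbol(v, alphabet):
    """Nearest alphabet point to the complex observation v."""
    return min(alphabet, key=lambda a: abs(v - a))

def ml_sequence(vs, alphabet):
    """Brute-force joint ML: search all M**n sequences for the closest one."""
    def dist2(u):
        return sum(abs(v - a) ** 2 for v, a in zip(vs, u))
    return min(itertools.product(alphabet, repeat=len(vs)), key=dist2)

# Hypothetical alphabet, block length, and noise level, for illustration only.
random.seed(1)
alphabet = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
sent = [random.choice(alphabet) for _ in range(5)]
received = [u + complex(random.gauss(0, 0.7), random.gauss(0, 0.7)) for u in sent]

joint = ml_sequence(received, alphabet)
separate = tuple(ml_symbol(v, alphabet) for v in received)
print(joint == separate)
```

The joint search costs $M^n$ distance evaluations while the separate decisions cost only $nM$, which is the practical content of the theorem.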


⁶In fact, these final ℓ − n components do not have to be independent or equally distributed; they simply must
be independent of the signals and noise for $1 \le k \le n$.

Note that the theorem states that the same decision is made for both sequence detection and
separate detection for each signal. It does not say that the probability of an error within the
sequence is the same as the error for a single signal. Letting P be the probability of error for a
single signal, the probability of error for the sequence is $1 - (1 - P)^n$.
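
As a rough numerical illustration of how the sequence error probability grows with n (the values of P and n below are chosen only for the example), note that

$$1 - (1 - P)^n \le nP, \qquad \text{and for } P = 10^{-3},\ n = 100: \quad 1 - (0.999)^{100} \approx 0.0952, \quad nP = 0.1.$$

Thus for small nP the sequence error probability is close to n times the single-signal error probability.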


The theorem makes no assumptions about the probabilities of the successive inputs, although 
the use of ML detection would not minimize the probability of error if the inputs were not 
independent and equally likely. If coding is used between the n input signals, then not all of
these $M^n$ n-tuples are possible. In this case, ML detection on the possible encoded sequences (as
opposed to all $M^n$ sequences) is different from separate detection on each signal. As an example,
if the transmitter always repeats each signal, with $u_1 = u_2$, $u_3 = u_4$, etc., then the detection of
$u_1$ should be based on both $v_1$ and $v_2$. Similarly, the detection of $u_3$ should be based on $v_3$ and
$v_4$, etc.


When coding is used, it is possible to make ML decisions on each signal separately, and then 
to use the coding constraints to correct errors in the detected sequence. These individual signal 
decisions are then called hard decisions. It is also possible, for each k, to save a sufficient
statistic (such as $v_k$) for the decision on $U_k$. This is called a soft decision since it saves all the
relevant information needed for an ML decision between the set of possible codewords. Since 
the ML decision between possible encoded sequences minimizes the error probability (assuming 
equi-probable codewords), soft decisions allow for smaller error probabilities than hard decisions. 
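
The difference between hard and soft decisions can be seen on the repetition example above. The following sketch is a simplified illustration, assuming real antipodal signals ±1 and a code in which each signal is sent twice; the function names and the sample observation are hypothetical. Hard decisions on each received value can disagree, leaving the decoder with no basis for choosing, whereas the soft values still determine the ML codeword.

```python
def hard_then_decode(v1, v2):
    """Hard decisions on each antipodal symbol, then check the repetition constraint."""
    h1 = 1.0 if v1 >= 0 else -1.0
    h2 = 1.0 if v2 >= 0 else -1.0
    # None signals that the two hard decisions disagree, so the error is
    # detected but cannot be corrected from the hard decisions alone.
    return h1 if h1 == h2 else None

def soft_ml(v1, v2):
    """ML over the two codewords (+1,+1) and (-1,-1) using the soft values."""
    return min((1.0, -1.0), key=lambda a: (v1 - a) ** 2 + (v2 - a) ** 2)

# Hypothetical observation: +1 was repeated, and noise pushed v1 slightly negative.
print(hard_then_decode(-0.2, 0.9))  # the hard decisions disagree
print(soft_ml(-0.2, 0.9))           # the soft decision still selects +1
```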


Theorem 8.4.1 can be extended to MAP detection if the input signals are statistically indepen- 
dent of each other (see Exercise 8.15). One can see this intuitively by drawing the decision 
boundaries for the two-dimensional real case; these decision boundaries are then horizontal and 
vertical lines. 


A nice way to interpret Theorem 8.4.1 is to observe that the detection of each signal $U_k$ depends
only on the corresponding received signal $V_k$; all other components of the received vector
are irrelevant to the decision on $U_k$. The next subsection generalizes from QAM to arbitrary
modulation schemes and also generalizes this notion of irrelevance.


8.4.3 Detection with arbitrary modulation schemes 


The previous sections have concentrated on detection of PAM and QAM systems, using real
hypotheses $\mathcal{A} = \{a_0, \ldots, a_{M-1}\}$ for PAM and complex hypotheses $\mathcal{A} = \{a_0, \ldots, a_{M-1}\}$ for QAM.
In each case, a sequence $\{u_k;\, k \in \mathbb{Z}\}$ of signals from $\mathcal{A}$ is modulated into a baseband waveform
$u(t) = \sum_k u_k\, p(t - kT)$. The PAM waveform is then either transmitted or first modulated to
passband. The complex QAM waveform is necessarily modulated to a real passband waveform.

This is now generalized by considering a signal set $\mathcal{A}$ to be an M-ary alphabet, $\{\mathbf{a}_0, \ldots, \mathbf{a}_{M-1}\}$,
of real n-tuples. Thus each $\mathbf{a}_m$ is an element of $\mathbb{R}^n$. The n components of the mth signal vector
are denoted by $\mathbf{a}_m = (a_{m,1}, \ldots, a_{m,n})^{\mathsf T}$. The selected signal vector $\mathbf{a}_m$ is then modulated into
a signal waveform $b_m(t) = \sum_{k=1}^{n} a_{m,k}\, \phi_k(t)$ where $\{\phi_1(t), \ldots, \phi_n(t)\}$ is a set of n orthonormal
waveforms.


The above provides a general scenario for mapping the symbols 0 to M − 1 into a set of signal
waveforms $b_0(t)$ to $b_{M-1}(t)$. A provision must also be made for transmitting a sequence of such
M-ary symbols. If these symbols are to be transmitted at T-spaced intervals, the most
straightforward way of accomplishing this is to choose the orthonormal waveforms $\phi_1(t), \ldots, \phi_n(t)$ in
such a way that $\phi_j(t - \ell T)$ and $\phi_k(t - \ell' T)$ are orthonormal for all j, k, $1 \le j, k \le n$ and all
integer $\ell, \ell'$. In this case, a sequence of symbols, say $s_1, s_2, \ldots$, each drawn from the
alphabet $\{0, \ldots, M-1\}$, could be mapped into a sequence of waveforms $b_{s_1}(t), b_{s_2}(t - T), \ldots$. The
transmitted waveform would then be $\sum_\ell b_{s_\ell}(t - \ell T)$.

Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].


PAM is a special case of this scenario where the dimension n is 1. The function $\phi_1(t)$ in this case
is the real modulation pulse $p(t)$ for baseband transmission and $\sqrt{2}\, p(t) \cos(2\pi f_c t)$ for passband
transmission. QAM is another special case where n is 2 at passband. In this case, the complex
signals $a_m$ are viewed as 2-dimensional real signals. The orthonormal waveforms (assuming real
$p(t)$) are $\phi_1(t) = \sqrt{2}\, p(t) \cos(2\pi f_c t)$ and $\phi_2(t) = \sqrt{2}\, p(t) \sin(2\pi f_c t)$.

More generally, it is not necessary to start at baseband and shift to passband⁷, and it is not
necessary for successive signals to be transmitted as time shifts of a basic waveform set. For
example, in frequency-hopping systems, successive n-dimensional signals can be modulated to
different carrier frequencies. What is important is that the successive transmitted signal
waveforms are all orthogonal to each other.


Let $X(t)$ be the first signal waveform in such a sequence of successive waveforms. Then $X(t)$ is a
choice from the set of M waveforms, $b_0(t), \ldots, b_{M-1}(t)$. We can represent $X(t)$ as $\sum_{k=1}^{n} X_k\, \phi_k(t)$
where, under hypothesis m, $X_k = a_{m,k}$ for $1 \le k \le n$. Let $\phi_{n+1}(t), \phi_{n+2}(t), \ldots$ be an additional
set of orthonormal functions such that the entire set $\{\phi_k(t);\, k \ge 1\}$ spans the space of real
$\mathcal{L}_2$ waveforms. The subsequence $\phi_{n+1}(t), \phi_{n+2}(t), \ldots$ might include the successive time shifts of
$\phi_1(t), \ldots, \phi_n(t)$ for the example above, but in general can be arbitrary. We do assume, however,
that successive signal waveforms are orthogonal to $\phi_1(t), \ldots, \phi_n(t)$, and thus that they can be
expanded in terms of $\phi_{n+1}(t), \phi_{n+2}(t), \ldots$. The received random waveform $Y(t)$ is assumed to
be the sum of $X(t)$, the WGN $Z(t)$, and contributions of signal waveforms other than X. These
other waveforms could include successive signals from the given channel input and also signals
from other users. This sum can be expanded over an arbitrarily large number, say ℓ, of these
orthonormal functions as

$$Y(t) = \sum_{k=1}^{\ell} Y_k\, \phi_k(t) = \sum_{k=1}^{n} (X_k + Z_k)\, \phi_k(t) + \sum_{k=n+1}^{\ell} Y_k\, \phi_k(t). \tag{8.43}$$


Note that in (8.43), the random process $\{Y(t);\, t \in \mathbb{R}\}$ specifies the random variables $Y_1, \ldots, Y_\ell$.
Assuming that the sample waveforms of $Y(t)$ are $\mathcal{L}_2$, it also follows that the limit as $\ell \to \infty$ of
$Y_1, \ldots, Y_\ell$ specifies $Y(t)$ in the $\mathcal{L}_2$ sense. Thus we consider $Y_1, \ldots, Y_\ell$ to be the observation at
the channel output. It is convenient to separate these terms into two vectors, $\mathbf{Y} = (Y_1, \ldots, Y_n)^{\mathsf T}$
and $\mathbf{Y}' = (Y_{n+1}, \ldots, Y_\ell)^{\mathsf T}$.

Similarly, the WGN $Z(t) = \sum_k Z_k\, \phi_k(t)$ can be represented by $\mathbf{Z} = (Z_1, \ldots, Z_n)^{\mathsf T}$ and
$\mathbf{Z}' = (Z_{n+1}, \ldots, Z_\ell)^{\mathsf T}$, and $X(t)$ can be represented as $\mathbf{X} = (X_1, \ldots, X_n)^{\mathsf T}$. Finally let
$V(t) = \sum_{k>n} V_k\, \phi_k(t)$ be the contributions from other users and successive signals from the
given user. Since these terms are orthogonal to $\phi_1(t), \ldots, \phi_n(t)$, $V(t)$ can be represented by
$\mathbf{V}' = (V_{n+1}, \ldots, V_\ell)^{\mathsf T}$. With these changes, (8.43) becomes

$$\mathbf{Y} = \mathbf{X} + \mathbf{Z}; \qquad \mathbf{Y}' = \mathbf{Z}' + \mathbf{V}'. \tag{8.44}$$


⁷It seems strange at first that the real vector and real waveform case here is more general than the complex
case, but the complex case is used for notational and conceptual simplifications at baseband, where the baseband
waveform will be modulated to passband and converted to a real waveform.

The observation is a sample value of $(\mathbf{Y}, \mathbf{Y}')$, and the detector must choose the MAP value
of $\mathbf{X}$. Assuming that $\mathbf{X}$, $\mathbf{Z}$, $\mathbf{Z}'$, and $\mathbf{V}'$ are statistically independent, the likelihoods can be
expressed as

$$f_{\mathbf{Y}\mathbf{Y}'|\mathbf{X}}(\mathbf{y}\,\mathbf{y}'\,|\,\mathbf{a}_m) = f_{\mathbf{Z}}(\mathbf{y} - \mathbf{a}_m)\, f_{\mathbf{Y}'}(\mathbf{y}').$$

The likelihood ratio between hypothesis $\mathbf{a}_m$ and $\mathbf{a}_{m'}$ is then given by

$$\Lambda_{m,m'}(\mathbf{y}) = \frac{f_{\mathbf{Z}}(\mathbf{y} - \mathbf{a}_m)}{f_{\mathbf{Z}}(\mathbf{y} - \mathbf{a}_{m'})}. \tag{8.45}$$

The important thing here is that all the likelihood ratios (for $0 \le m, m' \le M-1$) depend only
on $\mathbf{Y}$ and thus $\mathbf{Y}$ is a sufficient statistic for a MAP decision on $\mathbf{X}$. $\mathbf{Y}'$ is irrelevant to the
decision, and thus its probability density is irrelevant (other than the need to assume that $\mathbf{Y}'$
is statistically independent of $(\mathbf{Z}, \mathbf{X})$). This also shows that the size of ℓ is irrelevant. This
is summarized (and slightly generalized by dropping the Gaussian noise assumption) in the
following theorem.


Theorem 8.4.2 (Theorem of irrelevance). Let $\{\phi_k(t);\, k \ge 1\}$ be a set of real orthonormal
functions. Let $X(t) = \sum_{k=1}^{n} X_k\, \phi_k(t)$ and $Z(t) = \sum_{k=1}^{n} Z_k\, \phi_k(t)$ be the input to a channel and the
corresponding noise respectively, where $\mathbf{X} = (X_1, \ldots, X_n)^{\mathsf T}$ and $\mathbf{Z} = (Z_1, \ldots, Z_n)^{\mathsf T}$ are random
vectors. Let $Y'(t) = \sum_{k>n} Y_k\, \phi_k(t)$ where for each $\ell > n$, $\mathbf{Y}' = (Y_{n+1}, \ldots, Y_\ell)^{\mathsf T}$ is a random
vector that is statistically independent of the pair $\mathbf{X}, \mathbf{Z}$. Let $\mathbf{Y} = \mathbf{X} + \mathbf{Z}$. Then the LLR and the
MAP detection of $\mathbf{X}$ from the observation of $\mathbf{Y}, \mathbf{Y}'$ depends only on $\mathbf{Y}$. That is, the observed
sample value of $\mathbf{Y}'$ is irrelevant.


The orthonormal set $\{\phi_1(t), \ldots, \phi_n(t)\}$ chosen above appears to have a more central importance
than it really has. What is important is the existence of an n-dimensional subspace of real $\mathcal{L}_2$
that includes the signal set and has the property that the noise and signals orthogonal to this
subspace are independent of the noise and signal within the subspace. In the usual case, we
choose this subspace to be the space spanned by the signal set, but there are also cases where
the subspace must be somewhat larger to provide for the independence between the subspace
and its complement.


The irrelevance theorem does not specify how to do MAP detection based on the observed 
waveform, but rather shows how to reduce the problem to a finite dimensional problem. Since the 
likelihood ratios specify both the decision regions and the error probability for MAP detection, 
it is clear that the choice of orthonormal set cannot influence either the error probability or the 
mapping of received waveforms to hypotheses. 


One important constraint in the above analysis is that both the noise and the interference (from 
successive transmissions and from other users) are additive. The other important constraint is 
that the interference is both orthogonal to the signal X(t) and also statistically independent of 
X(t). The orthogonality is why Y = X + Z, with no contribution from the interference. The 
statistical independence is what makes Y’ irrelevant. 


If the interference is orthogonal but not independent, then a MAP decision based on $\mathbf{Y}$ alone
could still be made. The resulting error probability, however, would be greater than or equal to
that for a MAP decision based on $\{\mathbf{Y}, \mathbf{Y}'\}$. Thus the dependence generally permits a decrease
in error probability.


On the other hand, if the interference is non-orthogonal but independent, then $\mathbf{Y}$ would include
both noise and a contribution from the interference, and the error probability would typically be
larger, but never smaller, than in the orthogonal case. As a rule of thumb, then, non-orthogonal
interference tends to increase error probability, whereas dependence (if the receiver makes use
of it) tends to reduce error probability.


If successive statistically independent signals, $\mathbf{X}_1, \mathbf{X}_2, \ldots$, are modulated onto distinct sets of
orthonormal waveforms (i.e., if $\mathbf{X}_1$ is modulated onto the orthonormal waveforms $\phi_1(t)$ to $\phi_n(t)$,
$\mathbf{X}_2$ is modulated onto $\phi_{n+1}(t)$ to $\phi_{2n}(t)$, etc.) then it also follows, as in Subsection 8.4.2, that
ML detection on a sequence $\mathbf{X}_1, \ldots, \mathbf{X}_\ell$ is equivalent to separate ML decisions on each input
signal $\mathbf{X}_j$, $1 \le j \le \ell$. The details are omitted since the only new feature in this extension is
more complicated notation.


The higher dimensional mappings allowed in this subsection are sometimes called channel codes, 
and are sometimes simply viewed as more complex forms of modulation. The coding field is 
very large, but the following sections provide an introduction. 


8.5 Orthogonal signal sets and simple channel coding 


An orthogonal signal set is a set $\mathbf{a}_0, \ldots, \mathbf{a}_{M-1}$ of M real orthogonal M-vectors, each with the
same energy E. Without loss of generality we choose a basis for $\mathbb{R}^M$ in which the mth basis vector
is $\mathbf{a}_m/\sqrt{E}$. In this basis, $\mathbf{a}_0 = (\sqrt{E}, 0, 0, \ldots, 0)^{\mathsf T}$, $\mathbf{a}_1 = (0, \sqrt{E}, 0, \ldots, 0)^{\mathsf T}$, etc. Modulation onto
an orthonormal set $\{\phi_m(t)\}$ of waveforms then maps hypothesis $\mathbf{a}_m$ ($0 \le m \le M-1$) into the
waveform $\sqrt{E}\, \phi_m(t)$. After addition of WGN, the sufficient statistic for detection is a sample
value $\mathbf{y}$ of $\mathbf{Y} = \mathbf{A} + \mathbf{Z}$ where $\mathbf{A}$ takes on the values $\mathbf{a}_0, \ldots, \mathbf{a}_{M-1}$ with equal probability and
$\mathbf{Z} = (Z_0, \ldots, Z_{M-1})^{\mathsf T}$ has iid components $\mathcal{N}(0, N_0/2)$. It can be seen that the ML decision is
to decide on that m for which $y_m$ is largest.
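
The ML rule for orthogonal signals follows from expanding the squared distance: $\|\mathbf{y} - \mathbf{a}_m\|^2 = \|\mathbf{y}\|^2 - 2\sqrt{E}\, y_m + E$, so minimizing the distance over m is the same as maximizing $y_m$. The sketch below checks this numerically; the function names, the set size M = 8, the energy, and the noise standard deviation are hypothetical values chosen for illustration.

```python
import math
import random

def orthogonal_set(M, E):
    """Orthogonal signals sqrt(E) * e_m in R^M, for m = 0, ..., M-1."""
    root = math.sqrt(E)
    return [[root if k == m else 0.0 for k in range(M)] for m in range(M)]

def ml_min_distance(y, signals):
    """ML decision by direct search for the closest signal."""
    return min(range(len(signals)),
               key=lambda m: sum((yk - ak) ** 2 for yk, ak in zip(y, signals[m])))

def ml_largest_component(y):
    """ML decision by choosing the largest component of y."""
    return max(range(len(y)), key=lambda m: y[m])

# Hypothetical parameters, for illustration only.
random.seed(0)
signals = orthogonal_set(8, 2.0)
sent = 3
y = [a + random.gauss(0.0, 1.0) for a in signals[sent]]
print(ml_min_distance(y, signals) == ml_largest_component(y))
```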


The major case of interest for orthogonal signals is where M is a power of 2, say $M = 2^b$. Thus
the signal set can be used to transmit b binary digits, so the energy per bit is $E_b = E/b$. The
number of required degrees of freedom for the signal set, however, is $M = 2^b$, so the spectral
efficiency ρ (the number of bits per pair of degrees of freedom) is then $\rho = b/2^{b-1}$. As b gets
large, ρ gets small at almost an exponential rate. It will be shown, however, that for large enough
$E_b/N_0$, as b gets large holding $E_b$ constant, the ML error probability goes to 0. In particular, for
any $E_b/N_0 > \ln 2 = 0.693$, the error probability goes to 0 exponentially as $b \to \infty$. Recall that
$\ln 2 = 0.693$, i.e., −1.59 dB, is the Shannon limit for reliable communication on a WGN channel
with unlimited bandwidth. Thus the derivation to follow will establish the Shannon theorem for
WGN and unlimited bandwidth. Before doing that, however, two closely related types of signal
sets are discussed.


8.5.1 Simplex signal sets 


Consider the random vector $\mathbf{A}$ with orthogonal equiprobable sample values $\mathbf{a}_0, \ldots, \mathbf{a}_{M-1}$ as
described above. The mean value of $\mathbf{A}$ is then

$$\bar{\mathbf{A}} = \left( \frac{\sqrt{E}}{M}, \frac{\sqrt{E}}{M}, \ldots, \frac{\sqrt{E}}{M} \right)^{\mathsf T}.$$

We have seen that if a signal set is shifted by a constant vector, the Voronoi detection regions are
also shifted and the error probability remains the same. However, such a shift can change the
expected energy of the random signal vector. In particular, if the signals are shifted to remove
the mean, then the signal energy is reduced by the energy (norm squared) of the mean. In this
case, the energy of the mean is E/M. A simplex signal set is an orthogonal signal set with the
mean removed. That is,

$$\mathbf{S} = \mathbf{A} - \bar{\mathbf{A}}; \qquad \mathbf{s}_m = \mathbf{a}_m - \bar{\mathbf{A}}; \qquad 0 \le m \le M-1.$$

In other words, the mth component of $\mathbf{s}_m$ is $\sqrt{E}\,(1 - 1/M)$ and each other component is
$-\sqrt{E}/M$. Each simplex signal has energy $E(1 - 1/M)$, so the simplex set has the same
error probability as the related orthogonal set, but requires less energy by a factor of $(1 - 1/M)$.
The simplex set of size M has dimensionality M − 1, as can be seen from the fact that the sum
of all the signals is 0, so the signals are linearly dependent. Figure 8.6 illustrates the orthogonal
and simplex sets for M = 2 and 3.
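
The properties of the simplex set claimed above are easy to check numerically. The following sketch builds the simplex set by subtracting the mean from the orthogonal set in the basis used above; the function names and the values M = 4, E = 2 are hypothetical choices for illustration.

```python
import math

def simplex_set(M, E):
    """Orthogonal set in R^M with its mean (sqrt(E)/M in every coordinate) removed."""
    root = math.sqrt(E)
    mean = root / M
    return [[(root if k == m else 0.0) - mean for k in range(M)] for m in range(M)]

def energy(x):
    """Squared norm of a real vector."""
    return sum(c * c for c in x)

# Hypothetical size and energy, for illustration only.
M, E = 4, 2.0
s = simplex_set(M, E)
print([energy(x) for x in s])                          # each close to E * (1 - 1/M)
print([sum(x[k] for x in s) for k in range(M)])        # each coordinate sums to about 0
```

Because every signal is shifted by the same vector, the distance between any two simplex signals equals that between the corresponding orthogonal signals, namely $\sqrt{2E}$.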


For small M, the simplex set is a substantial improvement over the orthogonal set. For example, 
for M = 2, it has a 3 dB energy advantage (it is simply the antipodal one dimensional set). 
Also it uses half the dimensions of the orthogonal set. For large M, however, the improvement 
becomes almost negligible. 


[Figure 8.6 panels: orthogonal, simplex, and bi-orthogonal signal sets, shown for M = 2 and M = 3.]

Figure 8.6: Orthogonal, simplex, and bi-orthogonal signal constellations, normalized to unit energy.


8.5.2 Bi-orthogonal signal sets 


If $\mathbf{a}_0, \ldots, \mathbf{a}_{M-1}$ is a set of orthogonal signals, we call the set of 2M signals consisting of
$\pm\mathbf{a}_0, \ldots, \pm\mathbf{a}_{M-1}$ a bi-orthogonal signal set. Two and three dimensional examples of bi-orthogonal
signal sets are given in Figure 8.6.


It can be seen by the same argument used for orthogonal signal sets that the ML detection rule
for such a set is to first choose the dimension m for which $|y_m|$ is largest, and then choose $\mathbf{a}_m$
or $-\mathbf{a}_m$ depending on whether $y_m$ is positive or negative. Orthogonal signal sets and simplex
signal sets each have the property that each signal is equidistant from every other signal. For
bi-orthogonal sets, each signal is equidistant from all but one of the other signals. The exception,
for the signal $\mathbf{a}_m$, is the signal $-\mathbf{a}_m$.

The bi-orthogonal signal set of M dimensions contains twice as many signals as the orthogonal
set (thus sending one extra bit per signal), but has the same minimum distance between signals.
It is hard to imagine⁸ a situation where we would prefer an orthogonal signal set to a
bi-orthogonal set, since one extra bit per signal is achieved at essentially no cost. However, for the
limiting argument to follow, an orthogonal set is used since it is simpler to treat analytically.
As M gets very large, the advantage of bi-orthogonal signals becomes smaller, which is why,
asymptotically, the two are equivalent.
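
The ML detection rule for bi-orthogonal sets described in this subsection can be sketched in a few lines and compared with a direct search over all 2M signals. This is only an illustration: the function names and the sample observation are hypothetical, and the signals are assumed to be $\pm\sqrt{E}$ times the unit coordinate vectors.

```python
def biorthogonal_ml(y):
    """Return (m, sign): the index of the largest |y_m| and the sign of that component."""
    m = max(range(len(y)), key=lambda k: abs(y[k]))
    return m, (1 if y[m] >= 0 else -1)

def brute_force_ml(y, energy=1.0):
    """Search all 2M signals +/- sqrt(E) e_m for the one closest to y."""
    root = energy ** 0.5
    best = None
    for m in range(len(y)):
        for sign in (1, -1):
            d2 = sum((yk - (sign * root if k == m else 0.0)) ** 2
                     for k, yk in enumerate(y))
            if best is None or d2 < best[0]:
                best = (d2, m, sign)
    return best[1], best[2]

# Hypothetical observation in three dimensions, for illustration only.
y = [0.3, -1.2, 0.5]
print(biorthogonal_ml(y), brute_force_ml(y))
```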


8.5.3 Error probability for orthogonal signal sets 


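Since the signals differ only by the ordering of the coordinates, the probability of error does
not depend on which signal is sent; thus $\Pr(e) = \Pr(e \mid \mathbf{A} = \mathbf{a}_0)$. Conditional on $\mathbf{A} = \mathbf{a}_0$, $Y_0$ is
$\mathcal{N}(\sqrt{E}, N_0/2)$ and $Y_m$ is $\mathcal{N}(0, N_0/2)$ for $1 \le m \le M-1$. Note that if $\mathbf{A} = \mathbf{a}_0$ and $Y_0 = y_0$, then
an error is made if $Y_m \ge y_0$ for any m, $1 \le m \le M-1$. Thus

$$\Pr(e) = \int_{-\infty}^{\infty} f_{Y_0|\mathbf{A}}(y_0 \mid \mathbf{a}_0)\, \Pr\!\left( \bigcup_{m=1}^{M-1} (Y_m \ge y_0 \mid \mathbf{A} = \mathbf{a}_0) \right) dy_0. \tag{8.46}$$

The rest of the derivation of Pr(e), and its asymptotic behavior as M gets large, is simplified
if we normalize the outputs to $W_m = Y_m \sqrt{2/N_0}$. Then, conditional on signal $\mathbf{a}_0$ being sent,
$W_0$ is $\mathcal{N}(\sqrt{2E/N_0}, 1) = \mathcal{N}(\alpha, 1)$, where α is an abbreviation for $\sqrt{2E/N_0}$. Also, conditional on
$\mathbf{A} = \mathbf{a}_0$, $W_m$ is $\mathcal{N}(0, 1)$ for $1 \le m \le M-1$. Thus

$$\Pr(e) = \int_{-\infty}^{\infty} f_{W_0|\mathbf{A}}(w_0 \mid \mathbf{a}_0)\, \Pr\!\left( \bigcup_{m=1}^{M-1} (W_m \ge w_0 \mid \mathbf{A} = \mathbf{a}_0) \right) dw_0. \tag{8.47}$$

Using the union bound on the union above,

$$\Pr\!\left( \bigcup_{m=1}^{M-1} (W_m \ge w_0 \mid \mathbf{A} = \mathbf{a}_0) \right) \le (M - 1)\, Q(w_0). \tag{8.48}$$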
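
For moderate M, the integral in (8.47) can also be evaluated numerically. The sketch below is a rough illustration, not part of the derivation: it uses the independence of $W_1, \ldots, W_{M-1}$ to write the inner probability exactly as $1 - (1 - Q(w_0))^{M-1}$, and integrates with a simple trapezoidal rule. The function names, the step count, and the integration span are hypothetical choices.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x) = Pr{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def orthogonal_error_prob(M, alpha, steps=4000, span=8.0):
    """Numerically integrate (8.47) for M orthogonal signals.

    alpha is sqrt(2E/N0). The inner probability is written exactly as
    1 - (1 - Q(w0))**(M - 1), which assumes W_1, ..., W_{M-1} are iid N(0,1).
    """
    lo, hi = alpha - span, alpha + span
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        w0 = lo + i * h
        density = math.exp(-0.5 * (w0 - alpha) ** 2) / math.sqrt(2.0 * math.pi)
        inner = 1.0 - (1.0 - q_func(w0)) ** (M - 1)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal end weights
        total += weight * density * inner * h
    return total

# Hypothetical values of M and alpha, for illustration only.
alpha = 3.0
exact = orthogonal_error_prob(16, alpha)
union = 15 * q_func(alpha / math.sqrt(2.0))  # (8.48) integrated over w0
print(exact, union)
```

Integrating the right side of (8.48) against the density of $W_0$ gives $(M-1)\,Q(\alpha/\sqrt{2})$, which is the ordinary union bound; the numerical value of (8.47) lies below it.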


The union bound is quite tight when applied to independent quantities that have small aggregate
probability. Thus this bound will be quite tight when $w_0$ is large and M is not too large.
When $w_0$ is small, however, the bound becomes loose. For example, for $w_0 = 0$, $Q(w_0) = 1/2$
and the bound in (8.48) is $(M - 1)/2$, much larger than the obvious bound of 1 for any
probability. Thus, in the analysis to follow, the left side of (8.48) will be upper-bounded by 1 for
small $w_0$ and by $(M - 1)Q(w_0)$ for large $w_0$. Since both 1 and $(M - 1)Q(w_0)$ are valid upper
bounds for all $w_0$, the dividing point γ between small and large can be chosen arbitrarily. It is
chosen in what follows to satisfy

$$M \exp(-\gamma^2/2) = 1; \qquad \gamma = \sqrt{2 \ln M}. \tag{8.49}$$


⁸One possibility is that at passband a phase error of π can turn $\mathbf{a}_m$ into $-\mathbf{a}_m$. Thus with bi-orthogonal signals
it is necessary to track phase or use differential phase.

It might seem more natural in light of (8.48) to replace γ above by the $\gamma_1$ that satisfies
$(M - 1)Q(\gamma_1) = 1$, and that turns out to be the natural choice in the lower bound to Pr(e) developed
in Exercise 8.10. It is not hard to see, however, that $\gamma/\gamma_1$ goes to 1 as $M \to \infty$, so the difference
is not of major importance. Splitting the integral in (8.47) into $w_0 \le \gamma$ and $w_0 > \gamma$,

$$\begin{aligned}
\Pr(e) &\le \int_{-\infty}^{\gamma} f_{W_0|\mathbf{A}}(w_0 \mid \mathbf{a}_0)\, dw_0 + \int_{\gamma}^{\infty} f_{W_0|\mathbf{A}}(w_0 \mid \mathbf{a}_0)\, (M-1)\, Q(w_0)\, dw_0 && (8.50) \\
&\le Q(\alpha - \gamma) + \int_{\gamma}^{\infty} f_{W_0|\mathbf{A}}(w_0 \mid \mathbf{a}_0)\, (M-1)\, Q(\gamma) \exp\!\left[ \frac{\gamma^2 - w_0^2}{2} \right] dw_0 && (8.51) \\
&\le Q(\alpha - \gamma) + \int_{\gamma}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\!\left[ \frac{-(w_0 - \alpha)^2 + \gamma^2 - w_0^2}{2} \right] dw_0 && (8.52) \\
&= Q(\alpha - \gamma) + \frac{1}{\sqrt{2\pi}} \int_{\gamma}^{\infty} \exp\!\left[ \frac{-2(w_0 - \alpha/2)^2 + \gamma^2 - \alpha^2/2}{2} \right] dw_0 && (8.53) \\
&= Q(\alpha - \gamma) + \frac{1}{\sqrt{2}}\, Q\!\left( \sqrt{2}\left( \gamma - \frac{\alpha}{2} \right) \right) \exp\!\left[ \frac{\gamma^2}{2} - \frac{\alpha^2}{4} \right]. && (8.54)
\end{aligned}$$

The first term on the right side of (8.50) is the lower tail of the distribution of $W_0$, and is the
probability that the negative of the fluctuation of $W_0$ exceeds α − γ, i.e., $Q(\alpha - \gamma)$. In the second
term, $Q(w_0)$ is upper bounded using Exercise 8.7c, thus resulting in (8.51). This is simplified
by $(M - 1)Q(\gamma) \le M \exp(-\gamma^2/2) = 1$, resulting in (8.52). The exponent is then manipulated to
‘complete the square’ in (8.53), leading to an integral of a Gaussian density, as given in (8.54).

The analysis now breaks into three special cases, the first where $\alpha \le \gamma$, the second where
$\alpha/2 \le \gamma < \alpha$, and the third where $\gamma \le \alpha/2$. We explain the significance of these cases after
completing the bounds.

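Case (1): ($\alpha \le \gamma$) The argument of the first Q function in (8.54) is less than or equal to 0,
so its value lies between 1/2 and 1. This means that the upper bound on Pr(e) is at least 1/2,
which is a useless result. As seen later, this is the case where the rate is greater than or equal
to capacity. It is also shown in Exercise 8.10 that the error probability must be large in this case.

Case (2): ($\alpha/2 \le \gamma < \alpha$) Each Q function in (8.54) has a non-negative argument, so the bound
$Q(x) \le \frac{1}{2} \exp(-x^2/2)$ applies (see Exercise 8.7b).

$$\begin{aligned}
\Pr(e) &\le \frac{1}{2} \exp\!\left[ \frac{-(\alpha - \gamma)^2}{2} \right] + \frac{1}{2\sqrt{2}} \exp\!\left[ \frac{-\alpha^2}{4} + \frac{\gamma^2}{2} - (\gamma - \alpha/2)^2 \right] && (8.55) \\
&\le \left( \frac{1}{2} + \frac{1}{2\sqrt{2}} \right) \exp\!\left[ \frac{-(\alpha - \gamma)^2}{2} \right] \le \exp\!\left[ \frac{-(\alpha - \gamma)^2}{2} \right]. && (8.56)
\end{aligned}$$

Note that (8.56) follows from (8.55) by combining the terms in the exponent of the second term.
The fact that the exponents are equal is not too surprising, since γ was chosen to approximately
equalize the integrands in (8.50) at $w_0 = \gamma$.


Case (3): ($\gamma \le \alpha/2$) The argument of the second Q function in (8.54) is less than or equal to
0, so its value lies between 1/2 and 1 and is upper bounded by 1, yielding

$$\begin{aligned}
\Pr(e) &\le \frac{1}{2} \exp\!\left[ \frac{-(\alpha - \gamma)^2}{2} \right] + \frac{1}{\sqrt{2}} \exp\!\left[ \frac{-\alpha^2}{4} + \frac{\gamma^2}{2} \right] && (8.57) \\
&\le \exp\!\left[ \frac{-\alpha^2}{4} + \frac{\gamma^2}{2} \right]. && (8.58)
\end{aligned}$$

Since the two exponents in (8.55) are equal, the first exponent in (8.57) must be smaller than
the second, leading to (8.58). This is essentially the union bound derived in Exercise 8.8.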




The lower bound in Exercise 8.10 shows that these bounds are quite tight, but the sense in 
which they are tight will be explained later. 


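We now explore what α and γ are in terms of the number of codewords M and the energy per
bit, $E_b$. Recall that $\alpha = \sqrt{2E/N_0}$. Also $\log_2 M = b$ where b is the number of bits per signal.
Thus $\alpha = \sqrt{2bE_b/N_0}$. From (8.49), $\gamma^2 = 2 \ln M = 2b \ln 2$. Thus

$$\alpha - \gamma = \sqrt{2b} \left[ \sqrt{E_b/N_0} - \sqrt{\ln 2} \right].$$

Substituting these values into (8.56) and (8.58),

$$\Pr(e) \le \exp\!\left[ -b \left( \sqrt{E_b/N_0} - \sqrt{\ln 2} \right)^2 \right] \qquad \text{for } \frac{E_b}{4N_0} \le \ln 2 < \frac{E_b}{N_0}, \tag{8.59}$$

$$\Pr(e) \le \exp\!\left[ -b \left( \frac{E_b}{2N_0} - \ln 2 \right) \right] \qquad \text{for } \ln 2 < \frac{E_b}{4N_0}. \tag{8.60}$$

We see from this that if $E_b/N_0 > \ln 2$, then as $b \to \infty$ holding $E_b$ constant, $\Pr(e) \to 0$.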
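
To get a feel for how quickly the bounds (8.59) and (8.60) decay with b, they can be evaluated directly. The sketch below is a minimal illustration; the function name `orthogonal_error_bound` is hypothetical, $E_b/N_0$ is taken as a plain ratio rather than in dB, and the function returns the trivial bound 1 when $E_b/N_0 \le \ln 2$, where neither expression applies.

```python
import math

LN2 = math.log(2.0)

def orthogonal_error_bound(b, ebn0):
    """Upper bound on Pr(e) for M = 2**b orthogonal signals, from (8.59) and (8.60).

    ebn0 is E_b/N_0 as a plain ratio (not dB). Returns 1.0 when E_b/N_0 <= ln 2,
    where the bounds above give nothing useful.
    """
    if ebn0 <= LN2:
        return 1.0
    if ebn0 / 4.0 <= LN2:
        # Case (2): E_b/(4 N_0) <= ln 2 < E_b/N_0, bound (8.59).
        return math.exp(-b * (math.sqrt(ebn0) - math.sqrt(LN2)) ** 2)
    # Case (3): ln 2 < E_b/(4 N_0), bound (8.60).
    return math.exp(-b * (ebn0 / 2.0 - LN2))

# Hypothetical operating point E_b/N_0 = 2 (about 3 dB), for illustration only.
for b in (5, 10, 20, 40):
    print(b, orthogonal_error_bound(b, 2.0))
```

The two expressions agree at the boundary $E_b/N_0 = 4 \ln 2$, where both exponents equal $-b \ln 2$.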


Recall that in (7.86), we stated that the capacity (in bits per second) of a WGN channel of
bandwidth W, noise spectral density $N_0/2$, and power P is

$$C = W \log_2\!\left( 1 + \frac{P}{W N_0} \right). \tag{8.61}$$

With no bandwidth constraint, i.e., in the limit $W \to \infty$, the ultimate capacity is $C = \frac{P}{N_0 \ln 2}$.
This means that, according to Shannon’s theorem, for any rate $R < C = \frac{P}{N_0 \ln 2}$, there are codes
of rate R bits per second for which the error probability is arbitrarily close to 0. Now $P/R = E_b$,
so Shannon says that if $\frac{E_b}{N_0 \ln 2} > 1$, then codes exist with arbitrarily small error.
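
The limiting value of the capacity quoted above can be verified in one line, using $\ln(1 + x) = x - x^2/2 + \cdots$ for small x:

$$\lim_{W \to \infty} W \log_2\!\left( 1 + \frac{P}{W N_0} \right) = \lim_{W \to \infty} \frac{W}{\ln 2} \left[ \frac{P}{W N_0} - \frac{1}{2} \left( \frac{P}{W N_0} \right)^2 + \cdots \right] = \frac{P}{N_0 \ln 2}.$$

Substituting $P = R E_b$, the condition $R < P/(N_0 \ln 2)$ becomes $E_b/N_0 > \ln 2 \approx 0.693$, which is $10 \log_{10}(0.693) \approx -1.59$ dB.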


The orthogonal codes provide a concrete proof of this ultimate capacity result, since (8.59) shows
that Pr(e) can be made arbitrarily small (by increasing b) so long as $\frac{E_b}{N_0 \ln 2} > 1$. Shannon’s
theorem also says that the error probability cannot be made small if $\frac{E_b}{N_0 \ln 2} < 1$. We have not
quite proven that here, although Exercise 8.10 shows that the error probability cannot be made
arbitrarily small for an orthogonal code⁹ if $\frac{E_b}{N_0 \ln 2} < 1$.


The limiting operation here is slightly unconventional. As b increases, $E_b$ is held constant. This
means that the energy E in the signal increases linearly with b, but the size of the constellation
increases exponentially with b. Thus the bandwidth required for this scheme is infinite in the
limit, and going to infinity very rapidly. This means that this is not a practical scheme for 
approaching capacity, although sets of 64 or even 256 bi-orthogonal waveforms are used in 
practice. 


The point of the analysis, then, is first to show that this infinite bandwidth capacity can be ap- 
proached, but second to show also that using large but finite sets of orthogonal (or bi-orthogonal 
or simplex) waveforms does decrease error probability for fixed signal to noise ratio, and decreases 
it as much as desired (for rates below capacity) if enough bandwidth is used. 


⁹Since a simplex code has the same error probability as the corresponding orthogonal code, but differs in
energy from the orthogonal code by a vanishingly small amount as $M \to \infty$, the error probability for simplex
codes also cannot be made arbitrarily small for any given $\frac{E_b}{N_0 \ln 2}$ less than 1. It is widely believed, but never
proven, that simplex codes are optimal in terms of ML error probability whenever the error probability is small.
There is a known example, however, [30], for all $M \ge 7$, where the simplex is non-optimal, but in this example,
the signal to noise ratio is very small and the error probability is very large.

The different forms of solution in (8.59) and (8.60) are interesting, and not simply consequences
of the upper bounding used. For case (2), which leads to (8.59), the typical way that errors
occur is when $w_0 \approx \gamma$. In this situation, the union bound is on the order of 1, which indicates
that, conditional on $w_0 \approx \gamma$, it is quite likely that an error will occur. In other words, the typical
error event involves an unusually large negative value for $w_0$ rather than any unusual values for
the other noise terms. In case (3), which leads to (8.60), the typical way for errors to occur is
when $w_0 \approx \alpha/2$ and when some other noise term is also at about $\alpha/2$. In this case, an unusual
event is needed both in the signal direction and in some other direction.


A more intuitive way to look at this distinction is to visualize what happens when $E/N_0$ is held
fixed and M is varied. Case 3 corresponds to small M, case 2 to larger M, and case 1 to very
large M. For small M, one can visualize the Voronoi region around the transmitted signal point. 
Errors occur when the noise carries the signal point outside the Voronoi region, and that is most 
likely at the points in the Voronoi surface closest to the transmitted signal, i.e., at points half 
way between the transmitted point and some other signal point. As M increases, the number 
of these midway points increases until one of them is almost certain to cause an error when the 
noise in the signal direction becomes too large. 


8.6 Block Coding 


This section provides a brief introduction to the subject of coding for error correction on noisy 
channels. Coding is a major topic in modern digital communication, certainly far more major 
than suggested by the length of this introduction. In fact, coding is a topic that deserves its 
own text and its own academic subject in any serious communication curriculum. Suggested 
texts are [6] and [15]. Our purpose here is to provide enough background and examples to 
understand the role of coding in digital communication, rather than to prepare the student for 
coding research. We start by viewing orthogonal codes as block codes using a binary alphabet. 
This is followed by the Reed-Muller codes, which provide considerable insight into coding for 
the WGN channel. This then leads into Shannon’s celebrated noisy-channel coding theorem. 


A block code is a code for which the incoming sequence of binary digits is segmented into blocks 
of some given length m and then these binary m-tuples are mapped into codewords. There 
are thus $2^m$ codewords in the code; these codewords might be binary n-tuples of some block
length n > m, or might be vectors of signals, or might be waveforms. There is no fundamental
difference between coding and modulation; for example the orthogonal code above with $M = 2^m$
codewords can be viewed either as modulation with a large signal set or coding using binary 
m-tuples as input. 


8.6.1 Binary orthogonal codes and Hadamard matrices 


When orthogonal codewords are used on a WGN channel, any orthogonal set is equally good from 
the standpoint of error probability. One possibility, for example, is the use of orthogonal sine 
waves. From an implementation standpoint, however, there are simpler choices than orthogonal 
sine waves. Conceptually, also, it is helpful to see that orthogonal codewords can be constructed 
from binary codewords. This digital approach will turn out to be conceptually important in 
the study of fading channels and diversity in the next chapter. It also helps in implementation, 
since it postpones the point at which digital hardware gives way to analog waveforms. 

One digital approach to generating a large set of orthogonal waveforms comes from first
generating a set of M binary codewords, each of length M and each distinct pair differing in exactly
M/2 places. Each binary digit can then be mapped into an antipodal signal, $0 \to +a$ and
$1 \to -a$. This yields an M-tuple of real-valued antipodal signals, $s_1, \ldots, s_M$, which is then
mapped into the waveform $\sum_j s_j \phi_j(t)$ where $\{\phi_j(t); 1 \le j \le M\}$ is an orthonormal set (such as
sinc functions or Nyquist pulses). Since each pair of binary codewords differs in M/2 places, the
corresponding pair of waveforms are orthogonal and each waveform has equal energy. A binary
code with the above properties is called a binary orthogonal code.


There are many ways to generate binary orthogonal codes. Probably the simplest is from a
Hadamard matrix. For each integer $m \ge 1$, there is a $2^m$ by $2^m$ Hadamard matrix $H_m$. Each
distinct pair of rows in the Hadamard matrix $H_m$ differs in exactly $2^{m-1}$ places, so the $2^m$ rows
of $H_m$ constitute a binary orthogonal code with $2^m$ codewords.


It turns out that there is a simple algorithm for generating the Hadamard matrices. The
Hadamard matrix $H_1$ is defined to have the rows 00 and 01, which trivially satisfy the
condition that each pair of distinct rows differ in half the positions. For any integer $m \ge 1$, the
Hadamard matrix $H_{m+1}$ of order $2^{m+1}$ can be expressed as four $2^m$ by $2^m$ submatrices. Each of
the upper two submatrices is $H_m$, and the lower two submatrices are $H_m$ and $\overline{H}_m$, where $\overline{H}_m$
is the complement of $H_m$. This is illustrated in Figure 8.7 below.


$H_1$:   00
         01

$H_2$:   0000
         0101
         0011
         0110

$H_3$:   0000 0000
         0101 0101
         0011 0011
         0110 0110
         0000 1111
         0101 1010
         0011 1100
         0110 1001


Figure 8.7: Hadamard Matrices.
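
The recursive construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `hadamard` and `distance` are hypothetical, and the antipodal mapping uses a = 1 for convenience. It rebuilds the rows of $H_3$ from $H_1$ and checks that every distinct pair of rows differs in exactly M/2 places, so that the corresponding antipodal signal vectors are orthogonal.

```python
def hadamard(m):
    """Rows of the binary Hadamard matrix H_m (m >= 1), each a list of 0/1."""
    rows = [[0, 0], [0, 1]]  # H_1
    for _ in range(m - 1):
        upper = [r + r for r in rows]                    # (H, H)
        lower = [r + [1 - b for b in r] for r in rows]   # (H, complement of H)
        rows = upper + lower
    return rows


def distance(u, v):
    """Number of positions in which two binary rows differ."""
    return sum(a != b for a, b in zip(u, v))


H3 = hadamard(3)
M = len(H3)

# Every distinct pair of rows should differ in exactly M/2 places.
pairwise_ok = all(distance(H3[i], H3[j]) == M // 2
                  for i in range(M) for j in range(i + 1, M))

# Map 0 -> +a and 1 -> -a (with a = 1) and check that the signal vectors are orthogonal.
signals = [[1 - 2 * b for b in row] for row in H3]
inner_products = [sum(x * y for x, y in zip(signals[i], signals[j]))
                  for i in range(M) for j in range(i + 1, M)]
orthogonal = all(ip == 0 for ip in inner_products)

for row in H3:
    print("".join(str(b) for b in row))
print("pairwise distance M/2:", pairwise_ok)
print("signal vectors orthogonal:", orthogonal)
```

The printed rows should agree with the $H_3$ array of Figure 8.7.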


Note that each row of each matrix in Figure 8.7, other than the all zero row, contains half zeroes
and half ones. To see that this remains true for all larger values of m, we can use induction.
Thus assume, for given m, that $H_m$ contains a single row of all zeros and $2^m - 1$ rows, each
with exactly half ones. To prove the same for $H_{m+1}$, first consider the first $2^m$ rows of $H_{m+1}$.
Each row has twice the length and twice the number of ones as the corresponding row in $H_m$.
Next consider the final $2^m$ rows. Note that $\overline{H}_m$ has all ones in the first row and $2^{m-1}$ ones in
each other row. Thus the first row in the second set of $2^m$ rows of $H_{m+1}$ has no ones in the first
$2^m$ positions and $2^m$ ones in the final $2^m$ positions, yielding $2^m$ ones in $2^{m+1}$ positions. Each
remaining row has $2^{m-1}$ ones in the first $2^m$ positions and $2^{m-1}$ ones in the final $2^m$ positions,
totaling $2^m$ ones as required.

By a similar inductive argument (see Exercise 8.18), the mod-2 sum$^{10}$ of any two rows of $H_m$
is another row of $H_m$. Since the mod-2 sum of two rows gives the positions in which the rows
differ, and only the mod-2 sum of a codeword with itself gives the all 0 codeword, this means
that the set of rows is a binary orthogonal set.


The fact that the mod-2 sum of any two rows is another row makes the corresponding code a
special kind of binary code called a linear code, parity-check code or group code (these are all
synonyms). Binary M-tuples can be regarded as vectors in a vector space over the binary scalar
field. It is not necessary here to be precise about what a field is; so far it has been sufficient
to consider vector spaces defined over the real or complex fields. However, the binary numbers,
using mod-two addition and ordinary multiplication, also form a field and the familiar properties
of vector spaces apply here also.


$^{10}$The mod-2 sum of two binary numbers is defined by $0 \oplus 0 = 0$, $0 \oplus 1 = 1$, $1 \oplus 0 = 1$, and $1 \oplus 1 = 0$. The
mod-2 sum of two rows (or vectors) of binary numbers is the component-wise row (or vector) of mod-2 sums.


Since the set of codewords in a linear code is closed under mod-2 sums (and also closed under
scalar multiplication by 1 or 0), a linear code is a binary vector subspace of the binary vector
space of binary M-tuples. An important property of such a subspace, and thus of a linear
code, is that the set of positions in which two codewords differ is the set of positions in which
the mod-2 sum of those codewords contains 1’s. This means that the distance between two
codewords (i.e., the number of positions in which they differ) is equal to the weight (the number
of positions containing 1’s) of their mod-2 sum. This means, in turn, that for a linear code, the
minimum distance $d_{\min}$, taken between all distinct pairs of codewords, is equal to the minimum
weight (minimum number of 1’s) of any non-zero codeword.


Another important property of a linear code (other than the trivial code consisting of all binary
M-tuples) is that some components $x_k$ of each codeword $x = (x_1, \ldots, x_M)^T$ can be represented
as mod-2 sums of other components. For example, in the m = 3 case of Figure 8.7, $x_4 = x_2 \oplus x_3$,
$x_6 = x_2 \oplus x_5$, $x_7 = x_3 \oplus x_5$, $x_8 = x_4 \oplus x_5$, and $x_1 = 0$. Thus only 3 of the components can
be independently chosen, leading to a 3-dimensional binary subspace. Since each component is
binary, such a 3-dimensional subspace contains $2^3 = 8$ vectors. The components that are mod-2
combinations of previous components are called ‘parity checks’ and often play an important role
in decoding. The first component, $x_1$, can be viewed as a parity check since it cannot be chosen
independently, but its only role in the code is to help achieve the orthogonality property. It is
irrelevant in decoding.
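
As a small check of the m = 3 example, the following Python sketch builds all codewords from the three freely chosen components $x_2$, $x_3$, $x_5$ using the parity relations quoted above and compares the result with the rows of $H_3$ in Figure 8.7. The helper name `codeword` is hypothetical and exists only for this illustration.

```python
from itertools import product


def codeword(x2, x3, x5):
    """Build (x1, ..., x8) from the three independently chosen bits."""
    x1 = 0
    x4 = x2 ^ x3   # ^ is the mod-2 sum of two bits
    x6 = x2 ^ x5
    x7 = x3 ^ x5
    x8 = x4 ^ x5
    return (x1, x2, x3, x4, x5, x6, x7, x8)


generated = {"".join(str(b) for b in codeword(*bits))
             for bits in product((0, 1), repeat=3)}

# Rows of H_3 as listed in Figure 8.7.
figure_rows = {"00000000", "01010101", "00110011", "01100110",
               "00001111", "01011010", "00111100", "01101001"}

# For a linear code, minimum distance equals the minimum non-zero weight.
min_weight = min(row.count("1") for row in generated if "1" in row)

print("matches Figure 8.7:", generated == figure_rows)
print("minimum non-zero weight:", min_weight)
```

The minimum non-zero weight found this way is M/2 = 4, in agreement with the orthogonality property.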


It is easy to modify the binary orthogonal code to generate a binary simplex code, i.e., a binary
code which, after the mapping $0 \to a$, $1 \to -a$, forms a simplex in Euclidean space. The first
component of each binary codeword is dropped, turning the code into M codewords over M − 1
dimensions. Note that in terms of the antipodal signals generated by the binary digits, dropping
the first component converts the signal +a (corresponding to the first binary component 0) into
the signal 0 (which corresponds to neither the binary 0 nor the binary 1). The generation of the binary
biorthogonal code is equally simple; the rows of $H_m$ yield half of the codewords and the rows
of $\overline{H}_m$ yield the other half. Both the simplex and the biorthogonal code, as expressed in binary
form here, are linear binary block codes.


Two things have been accomplished with this representation of orthogonal codes. First, orthog- 
onal codes can be generated from a binary sequence mapped into an antipodal sequence, and 
second, an example has been given where modulation over a large alphabet can be viewed as a 
binary block code followed by modulation over a binary or very small alphabet. 


8.6.2 Reed-Muller codes 


Orthogonal codes (and the corresponding simplex and biorthogonal codes) use enormous
bandwidth for large M. The Reed-Muller codes constitute a class of binary linear block codes that
include large bandwidth expansion (in fact they include the binary biorthogonal codes) but also
allow for much smaller bandwidth expansion, i.e., they allow for binary codes with M codewords
where log M is much closer to the number of dimensions used by the code.

The Reed-Muller codes are specified by two integer parameters, $m \ge 1$ and $0 \le r \le m$; a binary
linear block code, denoted RM(r,m), exists for each such choice. The parameter m specifies the
block length to be $n = 2^m$. The minimum distance $d_{\min}(r,m)$ of the code and the number of
binary information digits k(r,m) required to specify a codeword are given by

$$d_{\min}(r,m) = 2^{m-r}, \qquad k(r,m) = \sum_{j=0}^{r} \binom{m}{j}, \tag{8.62}$$

where $\binom{m}{j} = \frac{m!}{j!\,(m-j)!}$. Thus these codes, like the binary orthogonal codes, exist only at block
lengths equal to a power of 2. While there is only one binary orthogonal code (as defined through
$H_m$) for each m, there is a range of RM codes for each m ranging from large $d_{\min}$ and small k
to small $d_{\min}$ and large k as r increases.


For each m, these codes are trivial for r = 0 and r = m. For r = 0 the code consists of two
codewords selected by a single bit, so k(0,m) = 1; one codeword is all 0’s and the other is all
1’s, leading to $d_{\min}(0,m) = 2^m$. For r = m, the code is the set of all binary $2^m$-tuples, leading
to $d_{\min}(m,m) = 1$ and $k(m,m) = 2^m$. For m = 1, then, there are two RM codes. RM(0,1)
consists of the two codewords (0,0) and (1,1), and RM(1,1) consists of the four codewords (0,0),
(0,1), (1,0), and (1,1).


For m > 1 and intermediate values of r, there is a simple algorithm, much like that for Hadamard
matrices, that specifies the set of codewords. The algorithm is recursive, and, for each m > 1
and 0 < r < m, specifies RM(r,m) in terms of RM(r,m−1) and RM(r−1,m−1). Specifically,
$x \in$ RM(r,m) if x is the concatenation of u and $u \oplus v$, denoted $x = (u, u \oplus v)$, for some
$u \in$ RM(r,m−1) and $v \in$ RM(r−1,m−1). More formally, for 0 < r < m,

$$\mathrm{RM}(r,m) = \{(u, u \oplus v) \mid u \in \mathrm{RM}(r,m-1),\ v \in \mathrm{RM}(r-1,m-1)\}. \tag{8.63}$$


The analogy with Hadamard matrices is that x is a row of $H_m$ if u is a row of $H_{m-1}$ and v is
either all ones or all zeros.
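
The recursion in (8.63), together with the two trivial cases r = 0 and r = m, can be turned directly into a short program. The following Python sketch (the names `reed_muller` and `min_distance` are hypothetical) enumerates RM(r,m) by brute force for small m and compares the number of codewords and the minimum distance with the expressions in (8.62). It is meant only as a numerical check for small parameters, not as a practical encoder.

```python
from itertools import product
from math import comb


def reed_muller(r, m):
    """Set of codewords (tuples of 0/1) of RM(r, m), for m >= 1 and 0 <= r <= m."""
    n = 2 ** m
    if r == 0:                       # repetition code: all 0's and all 1's
        return {(0,) * n, (1,) * n}
    if r == m:                       # all binary n-tuples
        return set(product((0, 1), repeat=n))
    U = reed_muller(r, m - 1)
    V = reed_muller(r - 1, m - 1)
    # x = (u, u + v) with the sum taken mod 2, as in (8.63)
    return {u + tuple(a ^ b for a, b in zip(u, v)) for u in U for v in V}


def min_distance(code):
    """For a linear code, the minimum distance is the minimum non-zero weight."""
    return min(sum(c) for c in code if any(c))


results = []
for m in range(1, 5):
    for r in range(m + 1):
        code = reed_muller(r, m)
        k = sum(comb(m, j) for j in range(r + 1))
        ok = (len(code) == 2 ** k) and (min_distance(code) == 2 ** (m - r))
        results.append((r, m, ok))
        print(f"RM({r},{m}): {len(code)} codewords, k = {k}, "
              f"dmin = {min_distance(code)}, agrees with (8.62): {ok}")
```

The enumeration grows as $2^{k(r,m)}$, so it is feasible only for small m.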


The first thing to observe about this definition is that if RM(r,m−1) and RM(r−1,m−1) are
linear codes, then RM(r,m) is also. To see this, let $x = (u, u \oplus v)$ and $x' = (u', u' \oplus v')$. Then

$$x \oplus x' = (u \oplus u',\ u \oplus u' \oplus v \oplus v') = (u'', u'' \oplus v''),$$

where $u'' = u \oplus u' \in$ RM(r,m−1) and $v'' = v \oplus v' \in$ RM(r−1,m−1). This shows that $x \oplus x' \in$
RM(r,m), and it follows that RM(r,m) is a linear code if RM(r,m−1) and RM(r−1,m−1) are.
Since both RM(0,m) and RM(m,m) are linear for all $m \ge 1$, it follows by induction on m that
all the Reed-Muller codes are linear.


Another observation is that different choices of the pair u and v cannot lead to the same value
of $x = (u, u \oplus v)$. To see this, let $x' = (u', u' \oplus v')$. Then if $u \ne u'$, it follows that the first half
of x differs from that of $x'$. Similarly if $u = u'$ and $v \ne v'$, then the second half of x differs
from that of $x'$. Thus $x = x'$ only if both $u = u'$ and $v = v'$. As a consequence of this, the
number of information bits required to specify a codeword in RM(r,m), denoted k(r,m), is equal
to the number required to specify a codeword in RM(r,m−1) plus that to specify a codeword
in RM(r−1,m−1), i.e., for 0 < r < m,

$$k(r,m) = k(r,m-1) + k(r-1,m-1).$$


Exercise 8.19 shows that this relationship implies the explicit form for k(r,m) given in (8.62). 
Finally Exercise 8.20 verifies the explicit form for dmin(r,m) in (8.62). 

The RM(1,m) codes are the binary bi-orthogonal codes and one can view the construction in
(8.63) as being equivalent to the Hadamard matrix algorithm by replacing the M by M matrix
$H_m$ in the Hadamard algorithm by the 2M by M matrix $\begin{bmatrix} H_m \\ G_m \end{bmatrix}$, where $G_m = \overline{H}_m$.


Another interesting case is the RM(m−2,m) codes. These have $d_{\min}(m-2,m) = 4$ and
$k(m-2,m) = 2^m - m - 1$ information bits. In other words, they have m + 1 parity checks.
As explained below, these codes are called extended Hamming codes. A property of all RM
codes is that all codewords have an even number$^{11}$ of 1’s and thus the last component in each
codeword can be viewed as an overall parity check which is chosen to ensure that the codeword
contains an even number of 1’s.


If this final parity check is omitted from RM(m−2,m) for any given m, the resulting code is
still linear and must have a minimum distance of at least 3, since only one component has been
omitted. This code is called the Hamming code of block length $2^m - 1$ with m parity checks. It
has the remarkable property that every binary $(2^m - 1)$-tuple is either a codeword in this code or
distance 1 from a codeword$^{12}$.
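
The covering property just stated can be checked by brute force for small m. The sketch below is an illustration under stated assumptions rather than a decoder: it regenerates RM(m−2,m) with a hypothetical helper `reed_muller` that follows (8.63), drops the last component to obtain the Hamming code, and counts the binary tuples that are either codewords or at distance 1 from a codeword.

```python
from itertools import product


def reed_muller(r, m):
    """Set of codewords (tuples of 0/1) of RM(r, m), built from (8.63)."""
    n = 2 ** m
    if r == 0:
        return {(0,) * n, (1,) * n}
    if r == m:
        return set(product((0, 1), repeat=n))
    U = reed_muller(r, m - 1)
    V = reed_muller(r - 1, m - 1)
    return {u + tuple(a ^ b for a, b in zip(u, v)) for u in U for v in V}


results = []
for m in (3, 4):
    n = 2 ** m - 1
    # Omit the final (overall parity) component of each extended Hamming codeword.
    hamming = {c[:-1] for c in reed_muller(m - 2, m)}
    dmin = min(sum(c) for c in hamming if any(c))
    covered = set()
    for c in hamming:
        covered.add(c)
        for i in range(n):           # the n neighbors at distance 1
            covered.add(c[:i] + (1 - c[i],) + c[i + 1:])
    covers_all = len(covered) == 2 ** n
    results.append((m, len(hamming), dmin, covers_all))
    print(f"m = {m}: {len(hamming)} codewords of length {n}, "
          f"dmin = {dmin}, covers all {2 ** n} tuples: {covers_all}")
```

For m = 3 this is the familiar Hamming code of block length 7 with 16 codewords.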


The Hamming codes are not particularly useful in practice for the following reasons. If one uses
a Hamming code at the input to a modulator and then makes hard decisions on the individual
bits before decoding, then a block decoding error is made whenever 2 or more bit errors occur.
This is a small improvement in reliability at a very substantial cost in transmission rate. On the
other hand, if soft decisions are made, using the extended Hamming code (i.e., RM(m−2,m))
extends $d_{\min}$ from 3 to 4, greatly decreasing the error probability with a marginal cost in added
redundant bits.


8.7 The noisy-channel coding theorem 


The previous section provided a brief introduction to coding. It provided several examples
showing that the use of binary codes could accomplish the same thing, for example, as the use
of large sets of orthogonal, simplex, or bi-orthogonal waveforms. There was an ad hoc nature to
the development, however, illustrating a number of schemes with various interesting properties,
but little in the way of general results.

The earlier results on Pr(e) for orthogonal codes were more fundamental, showing that Pr(e)
could be made arbitrarily small for a WGN channel with no bandwidth constraint if $E_b/N_0$ is
greater than $\ln 2$. This constituted a special case of the noisy-channel coding theorem, saying
that arbitrarily small Pr(e) can be achieved for that very special channel and set of constraints.


8.7.1 Discrete memoryless channels 


This section states and proves the noisy-channel coding theorem for another special case, that
of discrete memoryless channels (DMC’s). This may seem a little peculiar after all the emphasis
in this and the last chapter on WGN. There are two major reasons for this choice. The first is
that the argument is particularly clear in the DMC case, particularly after studying the AEP for
discrete memoryless sources. The second is that the argument can be generalized easily, as will
be discussed later. A DMC has a discrete input sequence $X = X_1, \ldots, X_k, \ldots$. At each discrete
time k, the input to the channel belongs to a finite alphabet $\mathcal{X}$ of symbols. For example, in
the last section, the input alphabet could be viewed as the signals $\pm a$. The question of interest
would then be whether it is possible to communicate reliably over a channel when the decision
to use the alphabet $\mathcal{X} = \{a, -a\}$ has already been made. The channel would then be regarded
as the part of the channel from signal selection to an output sequence from which detection
would be done. In a more general case, the signal set could be an arbitrary QAM set.


$^{11}$This property can be easily verified by induction.

$^{12}$To see this, note that there are $2^{2^m - 1 - m}$ codewords, and each codeword has $2^m - 1$ neighbors; these are
distinct from the neighbors of other codewords since $d_{\min}$ is at least 3. Adding the codewords and the neighbors,
we get the entire set of $2^{2^m - 1}$ vectors. This also shows that the minimum distance is exactly 3.


A DMC is also defined to have a discrete output sequence $Y = Y_1, \ldots, Y_k, \ldots$, where each
output $Y_k$ in the output sequence is a selection from a finite alphabet $\mathcal{Y}$ and is a probabilistic
function of the input and noise in a way to be described shortly. In the example above, the
output alphabet could be chosen as $\mathcal{Y} = \{a, -a\}$, corresponding to the case in which hard
decisions are made on each signal at the receiver. The channel would then include the modulation
and detection as an internal part, and the question of interest would be whether coding at
the input and decoding from the single-letter hard decisions at the output could yield reliable
communication.


Another choice would be to use the pre-decision outputs, first quantized to satisfy the finite 
alphabet constraint. Another, almost identical choice, would be a detector that produced a 
quantized LLR as opposed to a decision. 


In summary, the choice of discrete memoryless channel alphabets depends on what part of the 
overall communication problem is being addressed. 


In general, a channel is described not only by the input and output alphabets but also the
probabilistic description of the outputs conditional on the inputs (the probabilistic description
of the inputs is selected by the channel user). Let $X^n = (X_1, X_2, \ldots, X_n)^T$ be the channel input,
here viewed either over the lifetime of the channel or any time greater than or equal to the
duration of interest. Similarly the output is denoted by $Y^n = (Y_1, \ldots, Y_n)^T$. For a DMC, the
probability of the output n-tuple, conditional on the input n-tuple, is defined to satisfy

$$p_{Y^n|X^n}(y_1, \ldots, y_n \mid x_1, \ldots, x_n) = \prod_{k=1}^{n} p_{Y_k|X_k}(y_k \mid x_k), \tag{8.64}$$

where $p_{Y_k|X_k}(y_k = j \mid x_k = i)$, for each $j \in \mathcal{Y}$ and $i \in \mathcal{X}$, is a function only of i and j and not of
the time k. Thus, conditional on a given input sequence, the output symbols are independent
and each has a conditional distribution depending only on the corresponding input symbol. This
conditional distribution is denoted $P_{i,j}$ for all $i \in \mathcal{X}$ and $j \in \mathcal{Y}$, i.e., $p_{Y_k|X_k}(y_k = j \mid x_k = i) = P_{i,j}$.
Thus the channel is completely described by the input alphabet, the output alphabet, and the
conditional distribution function $P_{i,j}$. The conditional distribution function is usually called the
transition function or matrix.
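
To make (8.64) concrete, the following Python sketch represents a transition matrix as a nested list and evaluates the conditional probability of an output n-tuple as a product of per-symbol transition probabilities. The names `sequence_probability` and `bsc`, and the crossover value 0.1, are illustrative assumptions and do not come from the text.

```python
from itertools import product
from math import prod


def sequence_probability(P, x, y):
    """Probability of output tuple y given input tuple x for a DMC, as in (8.64).

    P[i][j] is the transition probability from input i to output j.
    """
    return prod(P[xk][yk] for xk, yk in zip(x, y))


def bsc(crossover):
    """Transition matrix of a binary symmetric channel with the given crossover probability."""
    return [[1 - crossover, crossover],
            [crossover, 1 - crossover]]


P = bsc(0.1)
x = (0, 1, 1, 0)
y = (0, 1, 0, 0)          # differs from x in one position
p_y_given_x = sequence_probability(P, x, y)

# Summing over all output 4-tuples must give 1 for any fixed input.
total = sum(sequence_probability(P, x, yy) for yy in product((0, 1), repeat=4))

print("p(y|x) =", p_y_given_x)
print("sum over all outputs =", total)
```

With one disagreement in four positions, the product is $0.9^3 \times 0.1$.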


The most intensely studied DMC over the past 60 years is the binary symmetric channel (BSC),
which has $\mathcal{X} = \{0,1\}$, $\mathcal{Y} = \{0,1\}$ and satisfies $P_{0,1} = P_{1,0}$. This single number $P_{0,1}$ thus specifies
the BSC. The WGN channel with antipodal inputs and ML hard decisions at the output is an
example of the BSC. Despite the intense study of the BSC and its inherent simplicity, the
question of optimal codes of long block length (optimal in the sense of minimum error probability) is
largely unanswered. Thus, the noisy-channel coding theorem, which describes various properties
of the achievable error probability through coding, plays a particularly important role in coding.


8.7.2 Capacity


The capacity C of a DMC is defined in this subsection. The following subsection, after defining
the rate R at which information enters the modulator, shows that reliable communication is 
impossible on a channel if R > C. This is known as the converse to the noisy-channel coding 
theorem, and is in contrast to the final subsection which shows that arbitrarily reliable commu- 
nication is possible for any R < C. As in the analysis of orthogonal codes, communication at 
rates below capacity can be made increasingly reliable with increasing block length, while this 
is not possible for R > C. 


The capacity is defined in terms of various entropies. For a given DMC and given sequence length
n, let $p_{Y^n|X^n}(y^n|x^n)$ be given by (8.64) and let $p_{X^n}(x^n)$ denote an arbitrary probability mass
function chosen by the user on the input $X_1, \ldots, X_n$. This leads to a joint entropy $H[X^nY^n]$.
From (2.37), this can be broken up as

$$H[X^nY^n] = H[X^n] + H[Y^n|X^n], \tag{8.65}$$

where $H[Y^n|X^n] = E[-\log p_{Y^n|X^n}(Y^n|X^n)]$. Note that because $H[Y^n|X^n]$ is defined as an
expectation over both $X^n$ and $Y^n$, $H[Y^n|X^n]$ depends on the distribution of $X^n$ as well as the
conditional distribution of $Y^n$ given $X^n$. The joint entropy $H[X^nY^n]$ can also be broken up
the opposite way as

$$H[X^nY^n] = H[Y^n] + H[X^n|Y^n]. \tag{8.66}$$

Combining (8.65) and (8.66), it is seen that $H[X^n] - H[X^n|Y^n] = H[Y^n] - H[Y^n|X^n]$. This
difference of entropies is called the mutual information between $X^n$ and $Y^n$ and denoted
$I(X^n;Y^n)$, so

$$I(X^n;Y^n) = H[X^n] - H[X^n|Y^n] = H[Y^n] - H[Y^n|X^n]. \tag{8.67}$$


The first expression for $I(X^n;Y^n)$ has a nice intuitive interpretation. $H[X^n]$ is understood
from source coding as representing the number of bits required to represent the channel input.
If we look at a particular sample value $y^n$ of the output, $H[X^n \mid Y^n = y^n]$ can be interpreted as
the number of bits required to represent $X^n$ after observing the output sample value $y^n$. Note
that $H[X^n|Y^n]$ is the expected value of this over $Y^n$. Thus $I(X^n;Y^n)$ can be interpreted as
the reduction in uncertainty, or number of required bits for specification, after passing through
the channel. This intuition will lead to the converse to the noisy-channel coding theorem in the
next subsection.


The second expression for $I(X^n;Y^n)$ is the one most easily manipulated. Taking the log of the
expression in (8.64) and then the expected value,

$$H[Y^n|X^n] = \sum_{k=1}^{n} H[Y_k|X_k]. \tag{8.68}$$

Since the entropy of a sequence of random symbols is upper bounded by the sum of the
corresponding terms (see Exercise 2.19),

$$H[Y^n] \le \sum_{k=1}^{n} H[Y_k]. \tag{8.69}$$

Substituting this and (8.68) in (8.67),

$$I(X^n;Y^n) \le \sum_{k=1}^{n} I(X_k;Y_k). \tag{8.70}$$


If the inputs are independent, then the outputs are also and (8.69) and (8.70) are satisfied
with equality. The mutual information $I(X_k;Y_k)$ at each time k is a function only of the pmf
for $X_k$, since the output probabilities conditional on the input are determined by the channel.
Thus, each mutual information term in (8.70) is upper bounded by the maximum of the mutual
information over the input distribution. This maximum is defined as the capacity of the channel,

$$C = \max_{p} \sum_{i \in \mathcal{X}} \sum_{j \in \mathcal{Y}} p_i P_{i,j} \log \frac{P_{i,j}}{\sum_{\ell \in \mathcal{X}} p_\ell P_{\ell,j}}, \tag{8.71}$$

where $p = (p_0, p_1, \ldots, p_{|\mathcal{X}|-1})$ is the set (over the alphabet $\mathcal{X}$) of input probabilities. The
maximum is over this set of input probabilities, subject to $p_i \ge 0$ for each $i \in \mathcal{X}$ and $\sum_{i \in \mathcal{X}} p_i = 1$.
The above function is concave in p, and thus the maximization is straightforward; for the
BSC, for example, the maximum is at $p_0 = p_1 = 1/2$ and $C = 1 + P_{0,1} \log P_{0,1} + P_{0,0} \log P_{0,0}$.
Since C upper bounds $I(X_k;Y_k)$ for each k, with equality if the distribution for $X_k$ is the
maximizing distribution,

$$I(X^n;Y^n) \le nC, \tag{8.72}$$

with equality if all inputs are independent and chosen with the maximizing probabilities in
(8.71).
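
As a numerical illustration of (8.71), the sketch below evaluates the mutual information of a single channel use for a BSC and maximizes it over the input pmf by a simple grid search, then compares the result with the closed-form BSC expression above. The function names, the crossover value 0.1, and the grid resolution are assumptions made for the example; logarithms are taken to base 2, so the capacity is in bits per channel use.

```python
from math import log2


def mutual_information(p, P):
    """The sum in (8.71) for input pmf p and transition matrix P, in bits."""
    num_outputs = len(P[0])
    # Output probabilities q_j = sum_l p_l P[l][j]
    q = [sum(p[l] * P[l][j] for l in range(len(p))) for j in range(num_outputs)]
    total = 0.0
    for i, p_i in enumerate(p):
        for j, P_ij in enumerate(P[i]):
            if p_i > 0 and P_ij > 0:      # terms with zero probability contribute 0
                total += p_i * P_ij * log2(P_ij / q[j])
    return total


def bsc_capacity(crossover):
    """Closed form C = 1 + P01 log P01 + P00 log P00 for the BSC (base-2 logs)."""
    P01, P00 = crossover, 1 - crossover
    return 1 + sum(t * log2(t) for t in (P01, P00) if t > 0)


crossover = 0.1
P = [[1 - crossover, crossover], [crossover, 1 - crossover]]

grid = [k / 1000 for k in range(1001)]
best_p0 = max(grid, key=lambda p0: mutual_information([p0, 1 - p0], P))
best_I = mutual_information([best_p0, 1 - best_p0], P)

print("maximizing p0:", best_p0)
print("grid-search capacity:", best_I)
print("closed-form capacity:", bsc_capacity(crossover))
```

A grid search is adequate here only because the input alphabet is binary; for larger alphabets an iterative method would be a more natural choice.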


8.7.3 Converse to the noisy-channel coding theorem 


Define the rate R for the DMC above as the number of iid equiprobable binary source digits
that enter the channel per channel use. More specifically assume that nR bits enter the source
and are transmitted over the n channel uses under discussion. Assume also that these bits are
mapped into the channel input $X^n$ in a one-to-one way. Thus $H[X^n] = nR$ and $X^n$ can take on
$M = 2^{nR}$ equiprobable values. The following theorem now bounds Pr(e) away from 0 if R > C.


Theorem 8.7.1. Consider a DMC with capacity C. Assume that the rate R satisfies R > C.
Then for any block length n, the ML probability of error, i.e., the probability that the decoded
n-tuple $\tilde{X}^n$ is unequal to the transmitted n-tuple $X^n$, is lower bounded by

$$R - C \le H_b(\Pr(e)) + R \Pr(e), \tag{8.73}$$

where $H_b(\alpha)$ is the binary entropy, $-\alpha \log \alpha - (1 - \alpha) \log(1 - \alpha)$.


Remark: The right hand side of (8.73) is 0 at Pr(e) = 0 and is increasing for $\Pr(e) \le 1/2$, so
(8.73) provides a lower bound to Pr(e) that depends only on C and R.
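
Since (8.73) bounds Pr(e) only implicitly, it may help to see how a numerical lower bound can be extracted from it. The following Python sketch uses bisection on the interval [0, 1/2], where the right hand side of (8.73) is increasing. The function names and the example values R = 1 and C = 0.5 are assumptions for illustration; rates are in bits per channel use and logarithms are base 2.

```python
from math import log2


def binary_entropy(a):
    """H_b(a) = -a log a - (1 - a) log(1 - a), in bits, with H_b(0) = H_b(1) = 0."""
    if a <= 0.0 or a >= 1.0:
        return 0.0
    return -a * log2(a) - (1 - a) * log2(1 - a)


def error_lower_bound(R, C, tol=1e-12):
    """A value p such that (8.73) forces Pr(e) >= p (approximately, within tol).

    Returns 0 when R <= C, since (8.73) then gives no constraint.
    """
    if R <= C:
        return 0.0
    gap = R - C

    def rhs(p):
        return binary_entropy(p) + R * p

    if rhs(0.5) < gap:
        # (8.73) cannot hold for any Pr(e) <= 1/2, so Pr(e) exceeds 1/2.
        return 0.5
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if rhs(mid) >= gap:
            hi = mid
        else:
            lo = mid
    return lo


bound = error_lower_bound(R=1.0, C=0.5)
print("Pr(e) is at least about", bound)
```

The bound depends only on R and C, as the remark above notes, and does not improve with block length.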


Proof: Note that $H[X^n] = nR$ and, from (8.72) and (8.67), $H(X^n) - H(X^n|Y^n) \le nC$. Thus

$$H(X^n|Y^n) \ge nR - nC. \tag{8.74}$$

For each sample value $y^n$ of $Y^n$, $H(X^n \mid Y^n = y^n)$ is an ordinary entropy. The received $y^n$
is decoded into some $\tilde{x}^n$ and the corresponding probability of error is $\Pr(X^n \ne \tilde{x}^n \mid Y^n = y^n)$.
As in Exercise 2.20, the entropy $H(X^n \mid Y^n = y^n)$ can be upper bounded as the sum of two
terms, first the binary entropy of whether or not $X^n = \tilde{x}^n$, and second, the entropy of all M − 1
possible errors in the case $X^n \ne \tilde{x}^n$, i.e.,

$$H(X^n \mid Y^n = y^n) \le H_b(\Pr(e \mid y^n)) + \Pr(e \mid y^n) \log(M - 1).$$

Upper bounding log(M − 1) by log M = nR and averaging over $Y^n$ (using the concavity of $H_b$
for the first term),

$$H(X^n|Y^n) \le H_b(\Pr(e)) + nR \Pr(e). \tag{8.75}$$

Combining (8.74) and (8.75),

$$R - C \le \frac{H_b(\Pr(e))}{n} + R \Pr(e),$$

and upper bounding 1/n by 1 yields (8.73).


The above theorem is not entirely satisfactory, since it shows that block errors cannot be made 
negligible at rates above capacity, but does not rule out the possibility that each block error 
causes only one bit error, say, and thus the probability of bit error might go to 0 as $n \to \infty$. As
shown in Theorem 4.3.4 of [7], this cannot happen, but the proof doesn’t add much insight and 
will be omitted here. 


8.7.4 Noisy-channel coding theorem, forward part


There are two critical ideas in the forward part of the coding theorem. The first is to use the
AEP on the joint ensemble $X^nY^n$. The second, however, is what shows the true genius of
Shannon. His approach, rather than an effort to find and analyze good codes, was to simply
choose each codeword of a code randomly, choosing each letter in each codeword to be iid with
the capacity-yielding input distribution.


One would think initially that the codewords should be chosen to be maximally different in 
some sense, but Shannon’s intuition said that independence would be enough. Some initial 
sense of why this might be true comes from looking at the binary orthogonal codes. Here each 
codeword of length n differs from each other codeword in n/2 positions, which is equal to the 
average number of differences with random choice. Another initial intuition comes from the 
fact that mutual information between input and output n-tuples is maximized by iid inputs. 
Truly independent inputs do not allow for coding constraints, but choosing a limited number of 
codewords using an iid distribution is at least a plausible approach. In any case, the following 
theorem proves that this approach works. 


It clearly makes no sense for the encoder to choose codewords randomly if the decoder doesn’t 
know what those codewords are, so we visualize the designer of the modem as choosing these 
codewords and building them into both transmitter and receiver. Presumably the designer 
is smart enough to test her code before shipping a million copies around the world, but we 
won't worry about that. We simply average the performance over all random choices. Thus 
the probability space consists of M independent iid codewords of block length n, followed by
a randomly chosen message m, $0 \le m \le M - 1$, that enters the encoder. The corresponding
sample value $x_m^n$ of the mth randomly chosen codeword is transmitted and combined with noise
to yield a received sample sequence $y^n$. The decoder then compares $y^n$ with the M possible
randomly chosen messages (the decoder knows $x_0^n, \ldots, x_{M-1}^n$, but doesn’t know m) and chooses
the most likely of them. It appears that a simple problem has been replaced with a complex
problem, but since there is so much independence between all the random symbols, the new
problem is surprisingly simple.

These randomly chosen codewords and channel outputs are now analyzed with the help of the
AEP. For this particular problem, however, it is simpler to use a slightly different form of AEP,
called the strong AEP, than that of Chapter 2. The strong AEP was analyzed in Exercise 2.28
and is reviewed here. Let $U^n = U_1, \ldots, U_n$ be an n-tuple of iid discrete random symbols with
alphabet $\mathcal{U}$ and letter probabilities $p_j$ for each $j \in \mathcal{U}$. Then for any $\epsilon > 0$, the strongly typical
set $S_\epsilon(U^n)$ of sample n-tuples is defined as

$$S_\epsilon(U^n) = \left\{ u^n : p_j(1 - \epsilon) < \frac{N_j(u^n)}{n} < p_j(1 + \epsilon); \text{ for all } j \in \mathcal{U} \right\}, \tag{8.76}$$

where $N_j(u^n)$ is the number of appearances of letter j in the n-tuple $u^n$. The double inequality
in (8.76) will be abbreviated as $N_j(u^n) \in np_j(1 \pm \epsilon)$, so (8.76) becomes

$$S_\epsilon(U^n) = \{ u^n : N_j(u^n) \in np_j(1 \pm \epsilon); \text{ for all } j \in \mathcal{U} \}. \tag{8.77}$$

Thus the strongly typical set is the set of n-tuples for which each letter appears with
approximately the right relative frequency. For any given $\epsilon$, the law of large numbers says that
$\lim_{n \to \infty} \Pr(N_j(U^n) \in np_j(1 \pm \epsilon)) = 1$ for each j. Thus (see Exercise 2.28)

$$\lim_{n \to \infty} \Pr(U^n \in S_\epsilon(U^n)) = 1. \tag{8.78}$$

Next consider the probability of n-tuples in $S_\epsilon(U^n)$. Note that $p_{U^n}(u^n) = \prod_j p_j^{N_j(u^n)}$. Taking
the log of this,

$$\log p_{U^n}(u^n) = \sum_j N_j(u^n) \log p_j \in -nH(U)(1 \pm \epsilon) \quad \text{for } u^n \in S_\epsilon(U^n). \tag{8.79}$$


Thus the strongly typical set has the same basic properties as the typical set defined in Chapter 
2. Because of the requirement that each letter has a typical number of appearances, however, it 
has additional properties that are useful in the coding theorem below. 
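
The definition of strong typicality can be illustrated with a short Python sketch that tests membership by counting letters and then checks the log-probability property for a typical sequence. The pmf, the sample sequences, and the function names below are assumptions chosen for the example, and every letter in the pmf is assumed to have strictly positive probability.

```python
from collections import Counter
from math import log2


def is_strongly_typical(u, pmf, eps):
    """True if every letter j appears with relative frequency in p_j (1 +/- eps)."""
    n = len(u)
    counts = Counter(u)
    if any(letter not in pmf for letter in counts):
        return False
    return all(p * (1 - eps) < counts.get(j, 0) / n < p * (1 + eps)
               for j, p in pmf.items())


def log_probability(u, pmf):
    """log2 of the probability of the iid sequence u, i.e. sum_j N_j log p_j."""
    counts = Counter(u)
    return sum(N * log2(pmf[j]) for j, N in counts.items())


pmf = {"a": 0.5, "b": 0.25, "c": 0.25}
eps = 0.1
entropy = -sum(p * log2(p) for p in pmf.values())   # H(U) = 1.5 bits

typical_seq = "a" * 10 + "b" * 5 + "c" * 5           # exact relative frequencies
atypical_seq = "a" * 14 + "b" * 3 + "c" * 3          # too many a's

n = len(typical_seq)
logp = log_probability(typical_seq, pmf)
within_bounds = -n * entropy * (1 + eps) < logp < -n * entropy * (1 - eps)

print("typical sequence accepted:", is_strongly_typical(typical_seq, pmf, eps))
print("atypical sequence accepted:", is_strongly_typical(atypical_seq, pmf, eps))
print("log2 probability:", logp, "within -nH(1 +/- eps):", within_bounds)
```

The order of the letters plays no role; only the letter counts $N_j(u^n)$ matter, both for membership and for the probability.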

Consider an n-tuple of channel input/output pairs, $X^nY^n = (X_1Y_1), (X_2Y_2), \ldots, (X_nY_n)$, where
successive pairs are iid. For each pair, XY, let X have the pmf $\{p_i; i \in \mathcal{X}\}$ which achieves
capacity in (8.71). Let the pair XY have the pmf $\{p_i P_{i,j}; i \in \mathcal{X}, j \in \mathcal{Y}\}$ where $P_{i,j}$ is the channel
transition probability from input i to output j. This is the joint pmf for the randomly chosen
codeword that is transmitted and the corresponding received sequence.


The strongly typical set $S_\epsilon(X^nY^n)$ is then given by (8.77) as

$$S_\epsilon(X^nY^n) = \{ x^ny^n : N_{ij}(x^ny^n) \in np_i P_{i,j}(1 \pm \epsilon); \text{ for all } i \in \mathcal{X}, j \in \mathcal{Y} \}, \tag{8.80}$$

where $N_{ij}(x^ny^n)$ is the number of xy pairs in $((x_1y_1), (x_2y_2), \ldots, (x_ny_n))$ for which x = i and
y = j. The transmitted codeword $X^n$ and the received n-tuple $Y^n$ then satisfy

$$\lim_{n \to \infty} \Pr((X^nY^n) \in S_\epsilon(X^nY^n)) = 1, \tag{8.81}$$

$$\log p_{X^nY^n}(x^ny^n) \in -nH(XY)(1 \pm \epsilon) \quad \text{for } (x^ny^n) \in S_\epsilon(X^nY^n). \tag{8.82}$$

The nice feature about strong typicality is that if $x^ny^n$ is in the set $S_\epsilon(X^nY^n)$, then $x^n$ must
be in $S_\epsilon(X^n)$ and $y^n$ must be in $S_\epsilon(Y^n)$. To see this, assume that $(x^n, y^n) \in S_\epsilon(X^nY^n)$. Then

$$N_i(x^n) = \sum_j N_{ij}(x^ny^n) \in \sum_j np_i P_{i,j}(1 \pm \epsilon) = np_i(1 \pm \epsilon) \quad \text{for all } i.$$

Thus $x^n \in S_\epsilon(X^n)$. The same argument shows that $y^n \in S_\epsilon(Y^n)$.


The noisy-channel coding theorem can now be stated and proved. 


Theorem 8.7.2. Consider a DMC with capacity C and let R be any fixed rate R < C. Then
for any $\delta > 0$, and all sufficiently large block lengths n, there exist block codes with $M \ge 2^{nR}$
equiprobable codewords such that the ML error probability satisfies $\Pr(e) \le \delta$.


Proof: As suggested above, we consider the error probability averaged over the random selection
of codes defined above, where for given block length n and rate R, the number of codewords
will be $M = \lceil 2^{nR} \rceil$. Since at least one code must be as good as the average, the theorem can be
proved by showing that $\Pr(e) \le \delta$.


The decoding rule to be used will be different than maximum likelihood, but since ML is
optimum, proving that $\Pr(e) \le \delta$ for any decoding rule will prove the theorem. The rule to be used
is strong typicality. That is, for given $\epsilon$ to be selected later, the decoder, given $y^n$, determines
whether there is an $\tilde{m}$ for which the pair $(x_{\tilde{m}}^n, y^n)$ lies in $S_\epsilon(X^nY^n)$. If there is exactly one $\tilde{m}$
satisfying this test, that is the decoded message; that decoded message is in error, of course, if
$\tilde{m}$ differs from the transmitted message m. If no $\tilde{m}$ or multiple $\tilde{m}$ satisfy the above test, the
decoding is also counted as an error, so the actual decoded value in these cases is immaterial for
the proof. The probability of error, given any transmitted message m, is then upper bounded
by two terms, first, $\Pr(X^nY^n \notin S_\epsilon(X^nY^n))$ where $X^nY^n$ is the transmitted/received pair,
and second, the probability that some other codeword is jointly typical with $Y^n$. The other
codewords are independent of $Y^n$ and each is chosen with iid symbols using the same pmf as
the transmitted codeword. Let $\tilde{X}^n$ be any one of these codewords. Using the union bound,

$$\Pr(e) \le \Pr((X^nY^n) \notin S_\epsilon(X^nY^n)) + (M - 1) \Pr((\tilde{X}^nY^n) \in S_\epsilon(X^nY^n)). \tag{8.83}$$


For any large enough n, (8.81) shows that the first term is at most 6/2. Also M—1 < 2", 
Thus 


Pr(e) < ° 4+ 2P8 pr((X"¥") € $.(X"¥")) (8.84) 


To analyze the second term above, define $F(y^n)$ as the set of input sequences $x^n$ that are jointly
typical with the given $y^n$. This set is empty if $y^n \notin S_\varepsilon(Y^n)$. Note that for $y^n \in S_\varepsilon(Y^n)$,


$$p_{Y^n}(y^n) \ge \sum_{x^n \in F(y^n)} p_{X^nY^n}(x^ny^n) \ge \sum_{x^n \in F(y^n)} 2^{-nH(XY)(1+\varepsilon)},$$


where the final inequality comes from (8.82). Since $p_{Y^n}(y^n) \le 2^{-nH(Y)(1-\varepsilon)}$ for $y^n \in S_\varepsilon(Y^n)$,
the conclusion is that the number of $n$-tuples in $F(y^n)$ for any typical $y^n$ satisfies


$$|F(y^n)| \le 2^{n[H(XY)(1+\varepsilon) - H(Y)(1-\varepsilon)]}. \tag{8.85}$$


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 




This means that the probability that $\tilde X^n$ lies in $F(y^n)$ is at most the size $|F(y^n)|$ times the
maximum probability of a typical $\tilde X^n$ (recall that $\tilde X^n$ is independent of $Y^n$ but has the same
marginal distribution as $X^n$). Thus

$$\Pr\big((\tilde X^nY^n) \in S_\varepsilon(X^nY^n)\big) \le 2^{-n[H(X)(1-\varepsilon) + H(Y)(1-\varepsilon) - H(XY)(1+\varepsilon)]} = 2^{-n\{C - \varepsilon[H(X) + H(Y) + H(XY)]\}},$$


where we have used the fact that $C = H(X) - H(X|Y) = H(X) + H(Y) - H(XY)$. Substituting
this into (8.84),


$$\Pr(e) \le \frac{\delta}{2} + 2^{n[R - C + \varepsilon\alpha]},$$

where $\alpha = H(X) + H(Y) + H(XY)$. Finally, choosing $\varepsilon = (C - R)/(2\alpha)$,


$$\Pr(e) \le \frac{\delta}{2} + 2^{-n(C-R)/2} \le \delta$$


for sufficiently large $n$.
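

As a rough numerical illustration of the last step, the sketch below evaluates the quantities in the proof for a binary symmetric channel with equiprobable inputs. The channel, the function names, and the sample values of the crossover probability, $R$, and $\delta$ are assumptions chosen for illustration. Only the second term $2^{-n(C-R)/2}$ is evaluated; the block length needed to make the first term at most $\delta/2$ depends on the convergence in (8.81), which is not computed here.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H_b(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def typicality_bound_parameters(p: float, rate: float):
    """Return (C, alpha, eps) for a BSC with crossover p and equiprobable inputs.

    Assumed model: H(X) = H(Y) = 1 bit and H(XY) = 1 + H_b(p).
    """
    h_x, h_y = 1.0, 1.0
    h_xy = 1.0 + binary_entropy(p)
    capacity = h_x + h_y - h_xy          # C = H(X) + H(Y) - H(XY)
    if not rate < capacity:
        raise ValueError("the proof requires R < C")
    alpha = h_x + h_y + h_xy             # alpha = H(X) + H(Y) + H(XY)
    eps = (capacity - rate) / (2 * alpha)
    return capacity, alpha, eps


def second_term(n: int, capacity: float, rate: float) -> float:
    """The term 2^{-n(C-R)/2} in the final bound."""
    return 2.0 ** (-n * (capacity - rate) / 2)


def min_block_length(capacity: float, rate: float, delta: float) -> int:
    """A block length n for which 2^{-n(C-R)/2} <= delta/2."""
    return math.ceil(2 * math.log2(2 / delta) / (capacity - rate))


if __name__ == "__main__":
    C, alpha, eps = typicality_bound_parameters(p=0.11, rate=0.25)
    n = min_block_length(C, 0.25, delta=0.01)
    print(C, alpha, eps, n, second_term(n, C, 0.25))
```

The choice $\varepsilon = (C-R)/(2\alpha)$ makes the exponent $R - C + \varepsilon\alpha$ equal to $-(C-R)/2$, which is what `second_term` evaluates.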


The above proof is essentially the original proof given by Shannon, with a little added explanation
of details. It will be instructive to explain the essence of the proof without any of the epsilons or
deltas. The transmitted and received $n$-tuple pair $(X^n, Y^n)$ is typical with high probability and
the typical pairs essentially have probability $2^{-nH(XY)}$ (including both the random choice of $X^n$
and the random noise). Each typical output $y^n$ essentially has a marginal probability $2^{-nH(Y)}$.
For each typical $y^n$, there are essentially $2^{nH(X|Y)}$ input $n$-tuples that are jointly typical with $y^n$
(this is the nub of the proof). An error occurs if any of these are selected to be codewords (other
than the actual transmitted codeword). Since there are about $2^{nH(X)}$ typical input $n$-tuples
altogether, a fraction $2^{-nI(X;Y)} = 2^{-nC}$ of them are jointly typical with the given received $y^n$.


More recent proofs of the noisy-channel coding theorem also provide much better upper bounds 
on error probability. These bounds are exponentially decreasing with n with a rate of decrease 
that typically becomes vanishingly small as R — C. 


8.7.5 The noisy-channel coding theorem for WGN 


The coding theorem for DMC’s can be easily extended to discrete-time channels with arbitrary 
real or complex input and output alphabets, but doing this with mathematical generality and 
precision is difficult with our present tools. 


This is done here for the discrete-time Gaussian channel, which will make clear the conditions
under which this generalization is easy. Let $X_k$ and $Y_k$ be the input and output to the channel
at time $k$, and assume that $Y_k = X_k + Z_k$ where $Z_k \sim \mathcal{N}(0, N_0/2)$ is independent of $X_k$, and
independent of the signal and noise at all other times. Assume the input is constrained in second
moment to $E[X_k^2] \le E$, so $E[Y_k^2] \le E + N_0/2$.


From Exercise 3.8, the differential entropy of $Y$ is then upper bounded by

$$h(Y) \le \tfrac{1}{2}\log\big(2\pi e(E + N_0/2)\big). \tag{8.86}$$


This is satisfied with equality if $Y$ is $\mathcal{N}(0, E + N_0/2)$, and thus if $X$ is $\mathcal{N}(0, E)$. For any given
input $x$, $h(Y \mid X = x) = \tfrac{1}{2}\log(2\pi e N_0/2)$, so averaging over the input space,


$$h(Y \mid X) = \tfrac{1}{2}\log(2\pi e N_0/2). \tag{8.87}$$


By analogy with the DMC case, let the capacity $C$ (in bits per channel use) be defined as the
maximum of $h(Y) - h(Y \mid X)$ subject to the second moment constraint $E$. Thus, combining (8.86)
and (8.87),


$$C = \tfrac{1}{2}\log\Big(1 + \frac{2E}{N_0}\Big). \tag{8.88}$$


Theorem 8.7.2 applies quite simply to this case. For any given rate $R$ in bits per channel use
such that $R < C$, one can quantize the channel input and output space finely enough so that the
corresponding discrete capacity is arbitrarily close to $C$ and in particular larger than $R$. Then
Theorem 8.7.2 applies, so rates arbitrarily close to $C$ can be transmitted with arbitrarily high
reliability. The converse to the coding theorem can also be extended.


For a discrete-time WGN channel using $2W$ degrees of freedom per second and a power constraint
$P$, the second moment constraint on each degree of freedom¹³ becomes $E = P/(2W)$ and the
capacity $C_t$ in bits per second becomes Shannon’s famous formula


$$C_t = W \log\Big(1 + \frac{P}{W N_0}\Big). \tag{8.89}$$

This is then the capacity of a WGN channel with input power constrained to $P$ and degrees of
freedom per second constrained to $2W$.
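

The relation between (8.88) and (8.89) can be checked numerically with a short sketch. The function names and the sample values of $P$, $W$, and $N_0$ are assumptions for illustration, and the logarithm is taken to base 2 so that the results are in bits.

```python
import math


def capacity_per_use(energy: float, n0: float) -> float:
    """Capacity in bits per channel use, as in (8.88): (1/2) log2(1 + 2E/N0)."""
    return 0.5 * math.log2(1 + 2 * energy / n0)


def capacity_per_second(power: float, bandwidth: float, n0: float) -> float:
    """Capacity in bits per second, as in (8.89): W log2(1 + P/(W N0))."""
    return bandwidth * math.log2(1 + power / (bandwidth * n0))


if __name__ == "__main__":
    P, W, N0 = 1e-3, 3000.0, 1e-9   # hypothetical power, bandwidth, noise density
    per_use = capacity_per_use(P / (2 * W), N0)
    # 2W degrees of freedom per second, each carrying `per_use` bits
    print(2 * W * per_use, capacity_per_second(P, W, N0))
```

Both printed values should agree, since substituting $E = P/(2W)$ into (8.88) and multiplying by $2W$ gives (8.89).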


With some careful interpretation, this is also the capacity of a continuous-time channel con- 
strained in bandwidth to W and in power to P. The problem here is that if the input is strictly 
constrained in bandwidth, no information at all can be transmitted. That is, if a single bit is 
introduced into the channel at time 0, the difference in the waveform generated by symbol 1 and 
that generated by symbol 0 must be 0 before time 0, and thus, by the Paley-Wiener theorem, 
cannot be nonzero and strictly bandlimited. From an engineering perspective, this doesn’t seem 
to make sense, but the waveforms used in all engineering systems have negligible but non-zero 
energy outside the nominal bandwidth. 


Thus, to use (8.89) for a bandlimited input, it is necessary to start with the constraint that for
any given $\eta > 0$, at least a fraction $(1 - \eta)$ of the energy must lie within a bandwidth $W$. Then
reliable communication is possible at all rates $R_t$ in bits per second less than $C_t$ as given in (8.89).
Since this is true for all $\eta > 0$, no matter how small, it makes sense to call this the capacity of
the bandlimited WGN channel. This is not an issue in the design of a communication system,
since filters must be used and it is widely recognized that they can’t be entirely bandlimited.


8.8 Convolutional codes 


The theory of coding, and particularly of coding theorems, concentrates on block codes, but
convolutional codes are also widely used and have essentially no block structure. These codes 
can be used whether bandwidth is highly constrained or not. We give an example below where 
there are two output bits for each input bit. Such a code is said to have rate 1/2 (in input bits 
per channel bit). More generally, such codes produce an m-tuple of output bits for each b-tuple 
of input bits for arbitrary integers 0 < b < m. These codes are said to have rate b/m. 


¹³We were careless in not specifying whether the constraint must be satisfied for each degree of freedom or
overall as a time-average. It is not hard to show, however, that the mutual information is maximized when the 
same energy is used in each degree of freedom. 


A convolutional code looks very much like a discrete filter. Instead of having a single input and
output stream, however, we have $b$ input streams and $m$ output streams. For the example given
here, the number of input streams is $b = 1$ and the number of output streams is $m = 2$, thus
producing two output bits per input bit. There is another difference between a convolutional
code and a discrete filter; the inputs and outputs for a convolutional code are binary and the
addition is modulo 2. Consider the example below in Figure 8.8.


Figure 8.8: Example of a convolutional code (diagram labels: information bits $D_k$, delayed bits $D_{k-1}$ and $D_{k-2}$)


For the example above, the equations for the outputs are


$$U_{k,1} = D_k \oplus D_{k-1} \oplus D_{k-2}$$
$$U_{k,2} = D_k \oplus D_{k-2}.$$


Thus each of the two output streams is a linear modulo-two convolution of the input stream.
This encoded pair of binary streams can now be mapped into a pair of signal streams such
as antipodal signals $\pm a$. This pair of signal streams can then be interleaved and modulated
by a single stream of Nyquist pulses at twice the rate. This baseband waveform can then be
modulated to passband and transmitted.
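

The two output equations translate directly into a few lines of code. The following is a minimal sketch, assuming the encoder memory starts at zero and assuming the mapping of bit 0 to $+a$ and bit 1 to $-a$; the function names are illustrative.

```python
from typing import Iterable, List, Tuple


def convolutional_encode(bits: Iterable[int]) -> List[Tuple[int, int]]:
    """Encode with U_{k,1} = D_k ^ D_{k-1} ^ D_{k-2} and U_{k,2} = D_k ^ D_{k-2}.

    The two memory cells are assumed to start at 0.
    """
    d1 = d2 = 0                      # d1 holds D_{k-1}, d2 holds D_{k-2}
    out = []
    for d in bits:
        out.append((d ^ d1 ^ d2, d ^ d2))
        d1, d2 = d, d1               # shift the register
    return out


def to_antipodal(pairs: Iterable[Tuple[int, int]], a: float = 1.0) -> List[float]:
    """Interleave the two output streams and map bit 0 -> +a, bit 1 -> -a."""
    return [a if u == 0 else -a for pair in pairs for u in pair]


if __name__ == "__main__":
    # A single 1 followed by zeros shows the two impulse responses 111 and 101.
    print(convolutional_encode([1, 0, 0]))
    print(to_antipodal(convolutional_encode([1, 0, 0])))
```

Because the encoder is linear over modulo-2 addition, encoding the bitwise XOR of two input sequences gives the bitwise XOR of their encodings.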


The structure of this code can be most easily visualized by a “trellis” diagram as illustrated in 
Figure 8.9. 


Figure 8.9: Trellis Diagram; each transition is labeled with the input and corresponding output (states 00, 10, 01, 11 arranged vertically; a typical transition label is 1→10)


To understand this trellis diagram, note from Figure 8.8 that the encoder is characterized at
any epoch $k$ by the previous binary digits, $D_{k-1}$ and $D_{k-2}$. Thus the encoder has four possible
states, corresponding to the four possible values of the pair $D_{k-1}, D_{k-2}$. Given any of these
four states, the encoder output and the next state depend only on the current binary input.




Figure 8.9 shows these four states arranged vertically and shows time horizontally. We assume
the encoder starts at epoch 0 with $D_{-1} = D_{-2} = 0$.


In the convolutional code of the above example, the output at epoch $k$ depends on the current
input and the previous two inputs. In this case, the constraint length of the code is 2. In general
the output could depend on the input and the previous $n$ inputs, and the constraint length is
then defined to be $n$. If the constraint length is $n$ (and a single binary digit enters the encoder
at each epoch $k$), then there are $2^n$ possible states, and the trellis diagram contains $2^n$ nodes at
each time instant rather than 4.
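

The state structure can be made concrete by tabulating, for every state and input bit, the next state and the pair of output bits. This is a sketch under the assumption that a state is written as the tuple $(D_{k-1}, \ldots, D_{k-n})$ and that each output is a modulo-2 sum selected by a tap vector; the function name and the tap-vector representation are illustrative.

```python
from itertools import product
from typing import Dict, Tuple

State = Tuple[int, ...]


def build_trellis(taps1=(1, 1, 1), taps2=(1, 0, 1)) -> Dict[Tuple[State, int], Tuple[State, Tuple[int, int]]]:
    """Map (state, input) -> (next_state, (u1, u2)).

    The default taps give U_{k,1} = D_k ^ D_{k-1} ^ D_{k-2} and
    U_{k,2} = D_k ^ D_{k-2}; the constraint length is n = len(taps) - 1.
    """
    n = len(taps1) - 1
    table = {}
    for state in product((0, 1), repeat=n):      # state = (D_{k-1}, ..., D_{k-n})
        for d in (0, 1):
            window = (d,) + state                # (D_k, D_{k-1}, ..., D_{k-n})
            u1 = sum(t & b for t, b in zip(taps1, window)) % 2
            u2 = sum(t & b for t, b in zip(taps2, window)) % 2
            table[(state, d)] = (window[:-1], (u1, u2))
    return table


if __name__ == "__main__":
    for (state, d), (nxt, out) in sorted(build_trellis().items()):
        print(f"state {state}, input {d} -> next {nxt}, output {out}")
```

With the default taps the table has $2^2 = 4$ states and two transitions per state; a tap vector of length $n+1$ yields $2^n$ states.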


As we have described convolutional codes above, the encoding starts at time 1 and then continues
forever. In practice, because of packetization of data and various other reasons, the encoding
usually comes to an end after some large number, say $k_0$, of binary digits have been encoded.
After $D_{k_0}$ enters the encoder, two final 0’s enter the encoder, at epochs $k_0+1$ and $k_0+2$, and
4 final encoded digits come out of the encoder. This restores the state of the encoder to state 0,
which, as we see later, is very useful for decoding. For the more general case with a constraint
length of $n$, we need $n$ final zeros to restore the encoder to state 0. Altogether, $k_0$ inputs lead to
$2(k_0 + n)$ outputs, for a code rate of $k_0/[2(k_0 + n)]$. Since $k_0$ is usually large relative to $n$, this
is still referred to as a rate 1/2 code. Figure 8.10 below shows the part of the trellis diagram
corresponding to this termination.


Figure 8.10: Trellis Termination
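

The effect of termination on the code rate is easy to quantify. The sketch below evaluates $k_0/[2(k_0+n)]$ exactly; the function name and the sample block lengths are assumptions for illustration.

```python
from fractions import Fraction


def terminated_rate(k0: int, n: int = 2, outputs_per_input: int = 2) -> Fraction:
    """Rate k0 / [2 (k0 + n)] of a terminated code with constraint length n."""
    return Fraction(k0, outputs_per_input * (k0 + n))


if __name__ == "__main__":
    for k0 in (10, 100, 1000, 10000):
        rate = terminated_rate(k0)
        print(k0, rate, float(rate))
```

The rate stays below 1/2 for every finite $k_0$ and approaches 1/2 as $k_0$ grows relative to $n$.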


8.8.1 Decoding of convolutional codes 


Decoding a convolutional code is essentially the same as using detection theory to choose between 
each pair of codewords, and then choosing the best overall (the same as done for the orthogonal 
code). There is one slight conceptual difference in that, in principle, the encoding continues 
forever. When the code is terminated, however, this problem does not exist, and in principle 
one takes the maximum likelihood choice of all the (finite length) possible codewords. 


As usual, assume that the incoming binary digits are iid and equiprobable. This is reasonable 
if the incoming bit stream has been source encoded. This means that the codewords out to any 
given length are equally likely, which then justifies maximum likelihood (ML) decoding. 


ML detection is also used so that codes for error correction can be designed independently of the 
source data to be transmitted. For all the codes under discussion, the error probability using
ML decoding is independent of the transmitted codeword. Thus ML decoding is robust in the 
sense that the error probability is independent of the probability distribution of the incoming 
bits. 


Another issue, given iid inputs, is determining what is meant by probability of error. In all of 
the examples above, given a received sequence of symbols, we have attempted to choose the 
codeword that minimizes the probability of error for the entire codeword. An alternative would 
have been to minimize the probability of error individually for each binary information digit. It 
turns out to be easier to minimize the sequence error probability than the bit error probability. 
This in fact is what happens when we use ML detection between codewords, as suggested above. 


In decoding for error correction, the objective is almost invariably to minimize the sequence 
probability of error. Along with the convenience suggested above, a major reason is that a 
binary input is usually a source coded version of some other source sequence or waveform, and 
thus a single output error is often as serious as multiple errors within a codeword. ML detection 
on sequences is assumed in what follows. 


8.8.2 The Viterbi algorithm 


The Viterbi algorithm is an algorithm for performing ML detection for convolutional codes.
Assume for the time being that the code is terminated as in Figure 8.10. It will soon be seen
that whether or not the code is terminated is irrelevant. The algorithm will now be explained
for the example above and for the assumption of WGN; the extension to arbitrary convolutional
codes will be obvious except for the notational complexity of the general case. For any given
input $d_1, \ldots, d_{k_0}$, let the encoded sequence be $u_{1,1}, u_{1,2}, u_{2,1}, u_{2,2}, \ldots, u_{k_0+2,2}$ and let the channel
output, after modulation, addition of WGN, and demodulation, be $v_{1,1}, v_{1,2}, v_{2,1}, v_{2,2}, \ldots, v_{k_0+2,2}$.


There are $2^{k_0}$ possible codewords, corresponding to the $2^{k_0}$ possible binary $k_0$-tuples $d_1, \ldots, d_{k_0}$,
so an unimaginative approach to decoding would be to compare the likelihood for each of these
codewords. For large $k_0$, even with today’s technology, such an approach would be prohibitive.
It turns out, however, that by using the trellis structure of Figure 8.9, this decoding effort can
be greatly simplified.


Each input $d_1, \ldots, d_{k_0}$ (i.e., each codeword) corresponds to a particular path through the trellis
from epoch 1 to $k_0+2$, and each path, at each epoch $k$, corresponds to a particular trellis state.


Consider two paths $d_1, \ldots, d_{k_0}$ and $d'_1, \ldots, d'_{k_0}$ through the trellis that pass through the same
state at time $k^+$ (i.e., at the time immediately after the input and state change at epoch
$k$) and remain together thereafter. Thus $d_{k+1}, \ldots, d_{k_0} = d'_{k+1}, \ldots, d'_{k_0}$. For example, from
Figure 8.8, we see that $(0, \ldots, 0)$ and $(1, 0, \ldots, 0)$ are both in state 00 at $3^+$ and both remain
in the same state thereafter. Since the two paths are in the same state at $k^+$ and have the
same inputs after this time, they both have the same encoder outputs after this time. Thus
$u_{k+1,i}, \ldots, u_{k_0+2,i} = u'_{k+1,i}, \ldots, u'_{k_0+2,i}$ for $i = 1, 2$.


Since each channel output rv $V_{k,i}$ is given by $V_{k,i} = U_{k,i} + Z_{k,i}$ and the Gaussian noise variables
$Z_{k,i}$ are independent, this means that for any channel output $v_{1,1}, \ldots, v_{k_0+2,2}$,

$$\frac{f(v_{1,1}, \ldots, v_{k_0+2,2} \mid d_1, \ldots, d_{k_0})}{f(v_{1,1}, \ldots, v_{k_0+2,2} \mid d'_1, \ldots, d'_{k_0})} = \frac{f(v_{1,1}, \ldots, v_{k,2} \mid d_1, \ldots, d_{k_0})}{f(v_{1,1}, \ldots, v_{k,2} \mid d'_1, \ldots, d'_{k_0})}.$$


In plain English, this says that if two paths merge at time $k^+$ and then stay together, the
likelihood ratio depends on only the first $k$ output pairs. Thus if the right hand side exceeds 1,
then $d_1, \ldots, d_{k_0}$ is more likely than $d'_1, \ldots, d'_{k_0}$. This conclusion holds no matter how the final
inputs $d_{k+1}, \ldots, d_{k_0}$ are chosen.


We then see that when two paths merge at a node, no matter what the remainder of the path 
is, the most likely of the paths is the one that is most likely at the point of the merger. Thus, 
whenever two paths merge, the least likely of the paths can be eliminated at that point. Doing 
this elimination successively from the smallest k for which paths merge (3 for the example), 
there is only one survivor for each state at each epoch. 


To be specific, let $h(d_1, \ldots, d_k)$ be the state at time $k^+$ with input $d_1, \ldots, d_k$. For the example,
$h(d_1, \ldots, d_k) = (d_{k-1}, d_k)$. Let


$$f_{\max}(k, s) = \max_{h(d_1, \ldots, d_k) = s} f(v_{1,1}, \ldots, v_{k,2} \mid d_1, \ldots, d_k).$$


These quantities can then be calculated iteratively for each state and each time $k$ by the iteration

$$f_{\max}(k+1, s) = \max_{r:\, r \to s} f_{\max}(k, r) \cdot f\big(v_{k+1,1} \mid u_1(r \to s)\big)\, f\big(v_{k+1,2} \mid u_2(r \to s)\big), \tag{8.90}$$


where the maximization is over the set of states $r$ that have a transition to state $s$ in the trellis
and $u_1(r \to s)$ and $u_2(r \to s)$ are the two outputs from the encoder corresponding to a transition
from $r$ to $s$.


This expression is simplified (for WGN) by taking the log, which, up to an additive constant, is
proportional to the negative squared distance between $v$ and $u$. For the antipodal signal case in
the example, this is further simplified by simply taking the dot product between $v$ and $u$. Letting
$L(k, s)$ be this dot product,


$$L(k+1, s) = \max_{r:\, r \to s} \big[ L(k, r) + v_{k+1,1}\, u_1(r \to s) + v_{k+1,2}\, u_2(r \to s) \big]. \tag{8.91}$$


What this means is that at each epoch (k+1), it is necessary to calculate the inner product in 
(8.91) for each link in the trellis going from k to k+ 1. These must be maximized over r for 
each state s at epoch (k+1). The maximum must then be saved as L(k + 1,s) for each s. One 
must, of course, also save the paths taken in arriving at each merging point. 


Those familiar with dynamic programming will recognize this as an example of the dynamic 
programming principle. 
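

The recursion (8.91), together with the bookkeeping of surviving paths, can be sketched in a few dozen lines for the example code. This is a minimal illustration rather than an optimized decoder: the state is taken to be $(D_{k-1}, D_{k-2})$, bit 0 is assumed to map to $+a$ and bit 1 to $-a$, the code is assumed to be terminated with two zeros, and each survivor is stored as a full list of input bits for simplicity. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

State = Tuple[int, int]
STATES: List[State] = [(0, 0), (1, 0), (0, 1), (1, 1)]


def branch(state: State, d: int) -> Tuple[State, Tuple[int, int]]:
    """Next state and output pair for state (D_{k-1}, D_{k-2}) and input d."""
    d1, d2 = state
    return (d, d1), (d ^ d1 ^ d2, d ^ d2)


def encode(bits: Sequence[int]) -> List[Tuple[int, int]]:
    """Encode the bits followed by two terminating zeros, starting in state (0, 0)."""
    state, out = (0, 0), []
    for d in list(bits) + [0, 0]:
        state, u = branch(state, d)
        out.append(u)
    return out


def antipodal(u: int, a: float = 1.0) -> float:
    """Assumed mapping: bit 0 -> +a, bit 1 -> -a."""
    return a if u == 0 else -a


def viterbi_decode(received: Sequence[Tuple[float, float]], k0: int) -> List[int]:
    """ML sequence decoding of k0 information bits from k0 + 2 received pairs."""
    metric = {s: (0.0 if s == (0, 0) else -math.inf) for s in STATES}
    paths = {s: [] for s in STATES}
    for k, (v1, v2) in enumerate(received):
        inputs = (0, 1) if k < k0 else (0,)      # the last two inputs are forced zeros
        new_metric = {s: -math.inf for s in STATES}
        new_paths = {s: [] for s in STATES}
        for r in STATES:
            if metric[r] == -math.inf:           # state r not reachable yet
                continue
            for d in inputs:
                s, (u1, u2) = branch(r, d)
                cand = metric[r] + v1 * antipodal(u1) + v2 * antipodal(u2)
                if cand > new_metric[s]:         # keep one survivor per state
                    new_metric[s] = cand
                    new_paths[s] = paths[r] + [d]
        metric, paths = new_metric, new_paths
    return paths[(0, 0)][:k0]                    # a terminated code ends in state (0, 0)


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0, 0, 1]
    tx = [(antipodal(u1), antipodal(u2)) for u1, u2 in encode(bits)]
    rx = list(tx)
    rx[0] = (-rx[0][0], rx[0][1])                # corrupt one received value
    print(viterbi_decode(rx, len(bits)) == bits)
```

Copying the survivor lists at every step keeps the sketch short; a practical decoder would typically store back-pointers and trace back instead.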

The entire computation for decoding a block of $k_0$ information bits is proportional to $4(k_0+2)$.
In the more general case where the constraint length of the convolutional coder is $n$ rather than
2, there are $2^n$ states and the computation is proportional to $2^n(k_0+n)$. The Viterbi algorithm is
usually used in cases where the constraint length is moderate, say 6 to 12, and in these situations,
the computation is quite moderate, especially compared with $2^{k_0}$.


Usually one does not wait until the end of the block to start decoding. Usually when the above
computation is done at epoch $k$, all the paths up to $k'$ have merged for $k'$ a few constraint lengths
less than $k$. In this case, one can decode without any bound on $k_0$, and the error probability is
viewed in terms of “error events” rather than block error.


8.9 Summary 


This chapter analyzed the last major segment of a general point-to-point communication system 
in the presence of noise, namely how to detect the input signals from the noisy version presented 
at the output. Initially the emphasis was on detection alone, i.e., the assumption was that the
rest of the system had been designed and the only question remaining was how to extract the 
signals. 


At a very general level, the problem of detection in this context is trivial. That is, under 
the assumption that the statistics of the input and the noise are known, the sensible problem is 
maximum a posteriori probability decoding: find the a posteriori probability of all the hypotheses 
and choose the largest. This is somewhat complicated by questions of whether to do sequence 
detection or bit detection, but these questions in a sense are details. 


At a more specific level, however, the detection problem led to many interesting insights and 
simplifications, particularly for WGN channels. A particularly important simplification is the 
principle of irrelevance, which says that components of the received waveform in degrees of 
freedom not occupied by the signal of interest (or statistically related signals) can be ignored 
in detection of those signals. Looked at in another way, this said that matched filters could be 
used to extract the degrees of freedom of interest. 


The last part of the chapter introduced coding and decoding. The focus changed here from 
decoding/detection to the question of how coding could change the input waveforms so as to 
make the decoding more effective. In other words, a MAP detector can be designed for any signal 
structure, but the real problem is to design both signal structure and detection for effective 
performance. 


At this point, the noisy-channel coding theorem came into the picture. If R < C, then the 
probability of error can be reduced arbitrarily by increasing block length (or constraint length 
in the case of convolutional codes). This means that there is no “optimal” solution to the 
joint problem of choosing signal structure and detection, but rather a trade-off between error 
probability, delay, and complexity. 


Thus the problem must involve not only overcoming the noise, but doing this with reasonable 
delay and complexity. The following chapter considers some of these problems in the context of 
wireless communication. 


8A Appendix: Neyman-Pearson threshold tests 


We have seen above that any binary MAP test can be formulated as a comparison of a likelihood 
ratio with a threshold. It turns out that many other detection rules can also be viewed as 
threshold tests on likelihood ratios. One of the most important binary detection problems 
for which a threshold test turns out to be essentially optimum is the Neyman-Pearson test. 
This is often used in those situations in which there is no sensible way to choose a priori 
probabilities. In the Neyman-Pearson test, an acceptable value $\alpha$ is established for $\Pr\{e \mid U=1\}$,
and, subject to the constraint $\Pr\{e \mid U=1\} \le \alpha$, a Neyman-Pearson test is a test that minimizes
$\Pr\{e \mid U=0\}$. We shall show in what follows that such a test is essentially a threshold test.
Before demonstrating this, we need some terminology and definitions. 


Define $q_0(\eta)$ to be $\Pr\{e \mid U=0\}$ for a threshold test with threshold $\eta$, $0 < \eta < \infty$, and similarly
define $q_1(\eta)$ as $\Pr\{e \mid U=1\}$. Thus for $0 < \eta < \infty$,


$$q_0(\eta) = \Pr\{\Lambda(V) < \eta \mid U=0\}; \qquad q_1(\eta) = \Pr\{\Lambda(V) \ge \eta \mid U=1\}. \tag{8.92}$$


Define $q_0(0)$ as $\lim_{\eta \to 0} q_0(\eta)$ and $q_1(0)$ as $\lim_{\eta \to 0} q_1(\eta)$. Clearly $q_0(0) = 0$ and in typical situations
$q_1(0) = 1$. More generally, $q_1(0) = \Pr\{\Lambda(V) > 0 \mid U=1\}$. In other words, $q_1(0)$ is less than 1 if
there is some set of observations that are impossible under $U=0$ but have positive probability
under $U=1$. Similarly, define $q_0(\infty)$ as $\lim_{\eta \to \infty} q_0(\eta)$ and $q_1(\infty)$ as $\lim_{\eta \to \infty} q_1(\eta)$. We have
$q_0(\infty) = \Pr\{\Lambda(V) < \infty \mid U=0\}$ and $q_1(\infty) = 0$.

Finally, for an arbitrary test $A$, threshold or not, denote $\Pr\{e \mid U=0\}$ as $q_0(A)$ and $\Pr\{e \mid U=1\}$
as $q_1(A)$.

Using (8.92), we can plot $q_0(\eta)$ and $q_1(\eta)$ as parametric functions of $\eta$; we call this the error
curve.¹⁴ Figure 8.11 illustrates this error curve for a typical detection problem such as (8.17)
and (8.18) for antipodal binary signalling. We have already observed that, as the threshold $\eta$ is
increased, the set of $v$ mapped into $U=0$ decreases, thus increasing $q_0(\eta)$ and decreasing $q_1(\eta)$.
Thus, as $\eta$ increases from 0 to $\infty$, the curve in Figure 8.11 moves from the lower right to the
upper left.


Figure 8.11: The error curve; $q_1(\eta)$ and $q_0(\eta)$ as parametric functions of $\eta$ (horizontal axis $q_1(\eta)$)


Figure 8.11 also shows a straight line of slope $-\eta$ through the point $(q_1(\eta), q_0(\eta))$ on the error
curve. The following lemma shows why this line is important.


Lemma 1: For each $\eta$, $0 < \eta < \infty$, the line of slope $-\eta$ through the point $(q_1(\eta), q_0(\eta))$ lies on or
beneath all other points $(q_1(\eta'), q_0(\eta'))$ on the error curve, and also lies on or beneath $(q_1(A), q_0(A))$
for all tests $A$.


Before proving this lemma, we give an example of the error curve for a discrete observation 
space. 


Example of Discrete Observations: Figure 8.12 shows the error curve for an example in
which the hypotheses 0 and 1 are again mapped $0 \to +a$ and $1 \to -a$. Assume that the
observation $V$ can take on only four discrete values $+3, +1, -1, -3$. The probabilities of each
of these values, conditional on $U=0$ and $U=1$, are given in the figure. As indicated there, the
likelihood ratio $\Lambda(v)$ then takes the values 4, 3/2, 2/3, and 1/4, corresponding respectively to
$v = 3, 1, -1$, and $-3$.

A threshold test at $\eta$ decides $U = 0$ if and only if $\Lambda(V) \ge \eta$. Thus, for example, for any $\eta \le 1/4$,
all possible values of $v$ are mapped into $U = 0$. In this range, $q_1(\eta) = 1$ since $U = 1$ always
causes an error. Also $q_0(\eta) = 0$ since $U = 0$ never causes an error. In the range $1/4 < \eta \le 2/3$,
since $\Lambda(-3) = 1/4$, the value $-3$ is mapped into $U = 1$ and all other values into $U = 0$. In this
range, $q_1(\eta) = 0.6$ since, when $U = 1$, an error occurs unless $V = -3$.


¹⁴In the radar field, one often plots $1 - q_0(\eta)$ as a function of $q_1(\eta)$. This is called the receiver operating
characteristic (ROC). If one flips the error curve vertically around the point 1/2, the ROC results.


In the same way, all threshold tests with $2/3 < \eta \le 3/2$ give rise to the decision rule that maps
$-1$ and $-3$ into $U = 1$ and 1 and 3 into $U = 0$. In this range $q_1(\eta) = q_0(\eta) = 0.3$. As shown, there
is another decision rule for $3/2 < \eta \le 4$ and a final decision rule for $\eta > 4$.


      v     P(v|0)   P(v|1)   Λ(v)
      3      0.4      0.1      4
      1      0.3      0.2      3/2
     −1      0.2      0.3      2/3
     −3      0.1      0.4      1/4

Points annotated in the plot: U = 1 for all v; U = 1 for v = −1, −3; U = 1 for v = −3.


Figure 8.12: The error curve for a discrete observation space. There are only five points
making up the ‘curve,’ one corresponding to each of the five distinct threshold rules. For
example, the threshold rule $U = 1$ only for $v = -3$ yields $(q_1(\eta), q_0(\eta)) = (0.6, 0.1)$ for all
$\eta$ in the range 1/4 to 2/3. A straight line of slope $-\eta$ through that point is also shown for
$\eta = 1/2$. The lemma asserts that this line lies on or beneath each point of the error curve and
each point $(q_1(A), q_0(A))$ for any other test. Note that as $\eta$ increases or decreases, this line will
rotate around the point (0.6, 0.1) until $\eta$ becomes larger than 2/3 or smaller than 1/4, and
then starts to rotate around the next point in the error curve.
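

The five points of this discrete error curve can be reproduced directly from the table. The sketch below uses exact fractions and the convention that a threshold test decides $U = 0$ when $\Lambda(v) \ge \eta$; the function names and the particular thresholds sampled (one per range) are choices made for illustration.

```python
from fractions import Fraction as F

# Conditional probabilities from the table of Figure 8.12.
P0 = {3: F(4, 10), 1: F(3, 10), -1: F(2, 10), -3: F(1, 10)}
P1 = {3: F(1, 10), 1: F(2, 10), -1: F(3, 10), -3: F(4, 10)}


def likelihood_ratio(v: int) -> F:
    """Lambda(v) = P(v|0) / P(v|1)."""
    return P0[v] / P1[v]


def error_point(eta: F):
    """(q1, q0) for the threshold test that decides U=0 iff Lambda(v) >= eta."""
    q0 = sum((P0[v] for v in P0 if likelihood_ratio(v) < eta), F(0))
    q1 = sum((P1[v] for v in P1 if likelihood_ratio(v) >= eta), F(0))
    return q1, q0


def error_curve_points():
    """One representative threshold from each of the five ranges of eta."""
    thresholds = [F(1, 8), F(1, 2), F(1), F(2), F(5)]
    return [error_point(eta) for eta in thresholds]


if __name__ == "__main__":
    for q1, q0 in error_curve_points():
        print(float(q1), float(q0))
```

Moving from the smallest threshold to the largest, the points run from the lower right of the figure to the upper left.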


The point of this example is that a finite observation space leads to an error curve that is simply 
a finite set of points. It is also possible for a continuously varying set of outputs to give rise 
to such an error curve when there are only finitely many possible likelihood ratios. The figure 
illustrates what the lemma means for error curves consisting only of a finite set of points. 


Proof of lemma: Consider the line of slope $-\eta$ through the point $(q_1(\eta), q_0(\eta))$. From plane
geometry, as illustrated in Figure 8.11, we see that the vertical axis intercept of this line is
$q_0(\eta) + \eta q_1(\eta)$. To interpret this line, define $p_0$ and $p_1$ as a priori probabilities such that $\eta = p_1/p_0$.
The overall error probability for the corresponding MAP test is then


$$q(\eta) = p_0 q_0(\eta) + p_1 q_1(\eta) = p_0\,[q_0(\eta) + \eta q_1(\eta)]; \qquad \eta = p_1/p_0. \tag{8.93}$$


Similarly, the overall error probability for an arbitrary test $A$ with the same a priori probabilities
is


$$q(A) = p_0\,[q_0(A) + \eta q_1(A)]. \tag{8.94}$$

From Theorem 8.1.1, $q(\eta) \le q(A)$, so, from (8.93) and (8.94),


$$q_0(\eta) + \eta q_1(\eta) \le q_0(A) + \eta q_1(A). \tag{8.95}$$


We have seen that the left side of (8.95) is the vertical axis intercept of the line of slope $-\eta$
through $(q_1(\eta), q_0(\eta))$. Similarly, the right side is the vertical axis intercept of the line of slope
$-\eta$ through $(q_1(A), q_0(A))$. This says that the point $(q_1(A), q_0(A))$ lies on or above the line of
slope $-\eta$ through $(q_1(\eta), q_0(\eta))$. This applies to every test $A$, which includes every threshold
test.
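

For a small discrete observation space, the inequality (8.95) can be checked by brute force over every deterministic test. The sketch below does this for the four-valued example with the probabilities of Figure 8.12; the enumeration over all 16 decision regions and the function names are illustrative choices, and randomized tests are not enumerated.

```python
from fractions import Fraction as F
from itertools import product

VALUES = (3, 1, -1, -3)
P0 = dict(zip(VALUES, (F(4, 10), F(3, 10), F(2, 10), F(1, 10))))
P1 = dict(zip(VALUES, (F(1, 10), F(2, 10), F(3, 10), F(4, 10))))


def error_pair(decide_zero):
    """(q1, q0) for a deterministic test deciding U=0 exactly on the set decide_zero."""
    q0 = sum((P0[v] for v in VALUES if v not in decide_zero), F(0))
    q1 = sum((P1[v] for v in VALUES if v in decide_zero), F(0))
    return q1, q0


def threshold_region(eta):
    """Observations mapped to U=0 by the threshold test at eta."""
    return {v for v in VALUES if P0[v] / P1[v] >= eta}


def lemma_holds(eta) -> bool:
    """Check q0(eta) + eta*q1(eta) <= q0(A) + eta*q1(A) for all 16 deterministic tests A."""
    q1_eta, q0_eta = error_pair(threshold_region(eta))
    intercept = q0_eta + eta * q1_eta
    for mask in product((False, True), repeat=len(VALUES)):
        region = {v for v, keep in zip(VALUES, mask) if keep}
        q1, q0 = error_pair(region)
        if q0 + eta * q1 < intercept:
            return False
    return True


if __name__ == "__main__":
    print(all(lemma_holds(F(n, 8)) for n in range(1, 41)))
```

For each sampled $\eta$, no test yields a smaller intercept than the threshold test at $\eta$, in agreement with the lemma.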


The lemma shows that if the error curve gives $q_0(\eta)$ as a differentiable function of $q_1(\eta)$ (as in
the case of Figure 8.11), then the line of slope $-\eta$ through $(q_1(\eta), q_0(\eta))$ is a tangent, at point
$(q_1(\eta), q_0(\eta))$, to the error curve. Thus in what follows we denote this line as the $\eta$-tangent to
the error curve. Note that the error curve of Figure 8.12 is not really a curve at all, but the
$\eta$-tangent, as defined above and illustrated in the figure for $\eta = 2/3$, still lies on or beneath all
points of the error curve and all achievable points $(q_1(A), q_0(A))$, as proven above.


Since, for each test $A$, the point $(q_1(A), q_0(A))$ lies on or above each $\eta$-tangent, it also lies on or
above the supremum of these $\eta$-tangents over $0 < \eta < \infty$. It also follows, then, that for each
$\eta'$, $0 < \eta' < \infty$, $(q_1(\eta'), q_0(\eta'))$ lies on or above this supremum. Since $(q_1(\eta'), q_0(\eta'))$ also lies on
the $\eta'$-tangent, it lies on or beneath the supremum, and thus must lie on the supremum. We
conclude that each point of the error curve lies on the supremum of the $\eta$-tangents.


Although all points of the error curve lie on the supremum of the $\eta$-tangents, all points of the
supremum are not necessarily points of the error curve, as seen from Figure 8.12. We shall
see shortly, however, that all points on the supremum are achievable by a simple extension of
threshold tests. Thus we call this supremum the extended error curve.


For the example in Figure 8.11 the extended error curve is the same as the error curve itself. 
For the discrete example in Figure 8.12, the extended error curve is shown in Figure 8.13. 


Figure 8.13: The extended error curve for the discrete observation example of Figure 8.12
(horizontal axis $q_1(\eta)$). From Lemma 1, for each slope $-\eta$, the $\eta$-tangent touches the error
curve. Thus, the line joining two adjacent points on the error curve must be an $\eta$-tangent for
its particular slope, and therefore must lie on the extended error curve.


To understand the discrete case better, assume that the extended error curve has a straight
line portion of slope $-\eta^*$ and horizontal extent $\gamma$. This implies that the distribution function
of $\Lambda(V)$ given $U=1$ has a discontinuity of magnitude $\gamma$ at $\eta^*$. Thus there is a set $V^*$ of one
or more $v$ with $\Lambda(v) = \eta^*$, $\Pr\{V^* \mid U=1\} = \gamma$, and $\Pr\{V^* \mid U=0\} = \eta^*\gamma$. For a MAP test with
threshold $\eta^*$, the overall error probability is not affected by whether $v \in V^*$ is detected as $U=0$
or $U=1$. Our convention is to detect $v \in V^*$ as $U=0$, which corresponds to the lower right point
on the straight line portion of the extended error curve. The opposite convention, detecting
$v \in V^*$ as $U=1$, reduces the error probability given $U=1$ by $\gamma$ and increases the error probability
given $U=0$ by $\eta^*\gamma$, i.e., it corresponds to the upper left point on the straight line portion of the
extended error curve.


Note that when we were interested in MAP detection, it made no difference how $v \in V^*$ was
detected for the threshold $\eta^*$. For the Neyman-Pearson test, however, it makes a great deal of
difference since $q_0(\eta^*)$ and $q_1(\eta^*)$ are changed. In fact, any point on the straight line in question
can be achieved by detecting $v \in V^*$ randomly. As the probability of choosing $U=0$ is increased
from 0 to 1 (given $v \in V^*$), the point $(q_1(\eta), q_0(\eta))$ moves from the upper left to lower right end
of the given line segment. In other words, the extended error curve is the curve relating $q_1$ to
$q_0$ using a randomized threshold test. For a given $\eta^*$, of course, only those $v \in V^*$ are detected
randomly.


To summarize, the Neyman-Pearson test is a randomized threshold test. For a constraint $\alpha$ on
$\Pr\{e \mid U=1\}$, we choose the point $\alpha$ on the abscissa of the extended error curve and achieve the
corresponding ordinate as the minimum $\Pr\{e \mid U=0\}$. If that point on the extended error curve
lies within a straight line segment of slope $-\eta^*$, a randomized test is used for those observations
with likelihood ratio $\eta^*$.
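

For the discrete example with the probabilities of Figure 8.12, the randomized test can be computed explicitly. The sketch below assumes that the likelihood ratios of distinct observations are distinct (true for this example), so that at most one observation value needs to be randomized; the function name and the form of the return value are illustrative.

```python
from fractions import Fraction as F

P0 = {3: F(4, 10), 1: F(3, 10), -1: F(2, 10), -3: F(1, 10)}
P1 = {3: F(1, 10), 1: F(2, 10), -1: F(3, 10), -3: F(4, 10)}


def neyman_pearson(alpha: F):
    """Return (v_star, theta, q0) for the constraint Pr{e|U=1} <= alpha.

    Observations with likelihood ratio above that of v_star are decided U=0,
    those below are decided U=1, and v_star itself is decided U=0 with
    probability theta. q0 is the resulting minimum Pr{e|U=0}.
    """
    order = sorted(P0, key=lambda v: P0[v] / P1[v], reverse=True)
    q1, q0 = F(0), F(1)                 # start from the test that always decides U=1
    for v in order:
        if q1 + P1[v] > alpha:          # deciding U=0 on all of v would violate the constraint
            theta = (alpha - q1) / P1[v]
            return v, theta, q0 - theta * P0[v]
        q1 += P1[v]
        q0 -= P0[v]
    return None, F(0), q0               # alpha >= 1: always decide U=0


if __name__ == "__main__":
    for alpha in (F(0), F(2, 10), F(3, 10), F(1)):
        print(alpha, neyman_pearson(alpha))
```

For $\alpha = 0.2$ the constraint falls inside the segment joining (0.1, 0.6) and (0.3, 0.3), so $v = 1$ is decided as $U=0$ with probability 1/2 and the minimum $\Pr\{e \mid U=0\}$ is 0.45.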


Since the extended error curve is a supremum of straight lines, it is a convex function. Since 
these straight lines all have negative slope, it is a monotonic decreasing function. Thus, Figures 
8.11 and 8.13 represent the general behavior of extended error curves, with the slight possible 
exception mentioned above that the end points need not have one of the error probabilities equal 
to 1. 


The following theorem summarizes the results about Neyman-Pearson tests. 


Theorem 8A.1. The extended error curve is convex and strictly decreasing between
$(q_1(\infty), q_0(\infty))$ and $(q_1(0), q_0(0))$. For a constraint $\alpha$ on Pr{e|U=1}, the minimum value of
Pr{e|U=0} is given by the ordinate of the extended error curve corresponding to the abscissa $\alpha$
and is achieved by a randomized threshold test.


There is one more interesting variation on the theme of threshold tests. If the a priori prob- 
abilities are unknown, we might want to minimize the maximum probability of error. That 
is, we visualize choosing a test followed by nature choosing a priori probabilities to maximize 
the probability of error. Our objective is to minimize the probability of error under this worst 
case assumption. The resulting test is called a minmax test. It can be seen geometrically from 
Figures 8.11 or 8.13 that the minmax test is the randomized threshold test at the intersection 
of the extended error curve with a 45° line from the origin. 


If there is symmetry between U = 0 and U = 1 (as in the Gaussian case), then the extended
error curve will be symmetric around the 45° line, and the threshold will be at $\eta = 1$ (i.e.,
the ML test is also the minmax test). This is an important result for Gaussian communication
problems, since it says that ML detection, i.e., minimum distance detection, is robust in the
sense of not depending on the input probabilities. If we know the a priori probabilities, we can
do better than the ML test, but we can do no worse.


* To be more precise, the extended error curve is strictly decreasing between the end points
$(q_1(\infty), q_0(\infty))$ and $(q_1(0), q_0(0))$.


8.E Exercises


8.1. (Binary minimum cost detection) (a) Consider a binary hypothesis testing problem with a
priori probabilities $p_0$, $p_1$ and likelihoods $f_{V|U}(v|i)$, $i = 0,1$. Let $C_{ij}$ be the cost of deciding
on hypothesis j when i is correct. Conditional on an observation V = v, find the expected
cost (over U = 0,1) of making the decision $\tilde{U} = j$ for j = 0,1. Show that the decision of
minimum expected cost is given by


$$\tilde{U}_{\text{mincost}} = \arg\min_j \left[ C_{0j}\, p_{U|V}(0|v) + C_{1j}\, p_{U|V}(1|v) \right]$$

(b) Show that the min cost decision above can be expressed as the following threshold test:


$$\Lambda(v) = \frac{f_{V|U}(v|0)}{f_{V|U}(v|1)} \;\begin{array}{c} \overset{\tilde{U}=0}{\geq} \\ \underset{\tilde{U}=1}{<} \end{array}\; \frac{p_1 (C_{10} - C_{11})}{p_0 (C_{01} - C_{00})}$$


(c) Interpret the result above as saying that the only difference between a MAP test and
a minimum cost test is an adjustment of the threshold to take account of the costs, i.e.,
a large cost of an error of one type is equivalent to having a large a priori probability for
that hypothesis.
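
The two forms in parts (a) and (b) can be compared numerically. The following Python sketch is only an illustration under stated assumptions: the function names and the cost-matrix layout `C[i][j]` (cost of deciding j when i is correct) are hypothetical choices, and the threshold form assumes $C_{01} > C_{00}$ and $C_{10} > C_{11}$.

```python
def min_cost_direct(f0, f1, p0, p1, C):
    """Decision of part (a): pick j minimizing the expected cost.

    f0, f1 are the likelihoods f(v|0), f(v|1) at the observed v.
    The common factor 1/f_V(v) in the posteriors is dropped, since it
    does not change which j is the minimizer.
    """
    cost = [C[0][j] * p0 * f0 + C[1][j] * p1 * f1 for j in (0, 1)]
    return 0 if cost[0] <= cost[1] else 1


def min_cost_threshold(f0, f1, p0, p1, C):
    """Decision of part (b): compare the likelihood ratio to a threshold.

    Assumes C[0][1] > C[0][0] and C[1][0] > C[1][1] (errors cost more
    than correct decisions), so the threshold is positive.
    """
    likelihood_ratio = f0 / f1
    threshold = p1 * (C[1][0] - C[1][1]) / (p0 * (C[0][1] - C[0][0]))
    return 0 if likelihood_ratio >= threshold else 1
```

With unit error costs and zero cost for correct decisions, the threshold reduces to $p_1/p_0$, the MAP threshold; raising $C_{10}$ raises the threshold and so favors deciding 1.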


8.2. Consider the following two equiprobable hypotheses:


U=0 : $V_1 = a\cos\Theta + Z_1$, $V_2 = a\sin\Theta + Z_2$,
U=1 : $V_1 = -a\cos\Theta + Z_1$, $V_2 = -a\sin\Theta + Z_2$.


$Z_1$ and $Z_2$ are iid $\mathcal{N}(0, \sigma^2)$, and $\Theta$ takes on the values $\{-\pi/4, 0, \pi/4\}$ each with probability
1/3.

Find the ML decision rule when $V_1$, $V_2$ are observed.

Hint: Sketch the possible values of $V_1$, $V_2$ for Z = 0 given each hypothesis. Then, without
doing any calculations, try to come up with a good intuitive decision rule. Then try to
verify that it is optimal.


8.3. Let
$$V_j = S_j X_j + Z_j \quad \text{for } 1 \le j \le 4$$
where $\{X_j; 1 \le j \le 4\}$ are iid $\mathcal{N}(0,1)$ and $\{Z_j; 1 \le j \le 4\}$ are iid $\mathcal{N}(0, \sigma^2)$ and independent
of $\{X_j; 1 \le j \le 4\}$. $\{V_j; 1 \le j \le 4\}$ are observed at the output of a communication system
and the input is a single binary random variable U which is independent of $\{Z_j; 1 \le j \le 4\}$
and $\{X_j; 1 \le j \le 4\}$. Given that U = 0, $S_1 = S_2 = 1$ and $S_3 = S_4 = 0$. Given U = 1,
$S_1 = S_2 = 0$ and $S_3 = S_4 = 1$.


(a) Find the log likelihood ratio


$$\mathrm{LLR}(v) = \ln\left( \frac{f_{V|U}(v|0)}{f_{V|U}(v|1)} \right)$$


(b) Let $\mathcal{E}_a = |V_1|^2 + |V_2|^2$ and $\mathcal{E}_b = |V_3|^2 + |V_4|^2$. Explain why $\{\mathcal{E}_a, \mathcal{E}_b\}$ form a sufficient
statistic for this problem and express the log likelihood ratio in terms of the sample values
of $\{\mathcal{E}_a, \mathcal{E}_b\}$.


(c) Find the threshold for ML detection.


(d) Find the probability of error. Hint: Review Exercise 6.1. Note: we will later see that
this corresponds to binary detection in Rayleigh fading.


8.4. Consider binary antipodal MAP detection for the real vector case. Modify the picture 
and argument in Figure 8.4 to verify the algebraic relation between the squared energy 
difference and the inner product in (8.21). 


8.5. Derive (8.35), i.e., that $\sum_{k,j} y_{k,j} b_{k,j} = \frac{1}{2} \int y(t) b(t)\, dt$. Explain the factor of 1/2.


8.6. In this problem, you will derive the inequalities


$$\left(1 - \frac{1}{x^2}\right) \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2} \;\le\; Q(x) \;\le\; \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2} \quad \text{for } x > 0, \qquad (8.96)$$


where $Q(x) = (2\pi)^{-1/2} \int_x^{\infty} \exp(-z^2/2)\, dz$ is the “tail” of the Normal distribution. The
purpose of this is to show that, when x is large, the right side of this inequality is a very
tight upper bound on Q(x).

(a) By using a simple change of variable, show that


$$Q(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2) \int_0^{\infty} \exp(-y^2/2 - xy)\, dy.$$


(b) Show that
$$1 - y^2/2 \le \exp(-y^2/2) \le 1.$$
(c) Use parts (a) and (b) to establish (8.96).
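
As a numerical sanity check on (8.96), and not a substitute for the derivation, the bounds can be evaluated next to Q(x) itself. The sketch below is an illustration only; it assumes the standard identity $Q(x) = \frac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$, and the function names are hypothetical.

```python
import math


def Q(x):
    """Gaussian tail probability, via Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q_bounds(x):
    """Return (lower, upper) from (8.96); valid for x > 0."""
    upper = math.exp(-x * x / 2.0) / (x * math.sqrt(2.0 * math.pi))
    lower = (1.0 - 1.0 / (x * x)) * upper
    return lower, upper


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 4.0, 6.0):
        lower, upper = q_bounds(x)
        print(f"x={x}: lower={lower:.3e}  Q={Q(x):.3e}  upper={upper:.3e}")
```

The lower bound is negative (and therefore uninformative) for x < 1, while for large x the two bounds close in on Q(x), which is the sense in which the upper bound is tight.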


8.7. (Other bounds on Q(x)) (a) Show that the following bound holds for any $\gamma$ and $\eta$ such
that $0 \le \gamma$ and $0 \le \eta$:


$$Q(\gamma + \eta) \le Q(\gamma) \exp[-\eta\gamma - \eta^2/2]$$


Hint: Start with $Q(\gamma + \eta) = \frac{1}{\sqrt{2\pi}} \int_{\gamma+\eta}^{\infty} \exp[-x^2/2]\, dx$ and use the change of variable $y = x - \eta$.


(b) Use part (a) to show that for all $\eta \ge 0$,


$$Q(\eta) \le \frac{1}{2} \exp[-\eta^2/2]$$

(c) Use (a) to show that for all $0 \le \gamma \le w$,


$$\frac{Q(w)}{\exp[-w^2/2]} \le \frac{Q(\gamma)}{\exp[-\gamma^2/2]}$$


Note: (8.96) shows that Q(w) goes to 0 with increasing w as a slowly varying coefficient
times $\exp[-w^2/2]$. This demonstrates that the coefficient is decreasing for $w \ge 0$.


8.8. (Orthogonal signal sets) An orthogonal signal set is a set $A = \{a_m, 0 \le m \le M-1\}$ of M
orthogonal vectors in $\mathbb{R}^M$ with equal energy E; i.e., $\langle a_m, a_j \rangle = E\delta_{mj}$.

(a) Compute the normalized rate $\rho$ of A in bits per two dimensions. Compute the average
energy $E_b$ per information bit.

(b) Compute the minimum squared distance $d^2_{\min}(A)$ between these signal points. Show
that every signal has M−1 nearest neighbors.

(c) Let the noise variance be $N_0/2$ per dimension. Describe a ML detector on this set of
M signals. Hint: Represent the signal set in an orthonormal expansion where each vector
is collinear with one coordinate. Then visualize making binary decisions between each pair
of possible signals.

8.9. (Orthogonal signal sets; continuation of Exercise 8.8) Consider a set $A = \{a_m, 0 \le m \le
M-1\}$ of M orthogonal vectors in $\mathbb{R}^M$ with equal energy E.


(a) Use the union bound to show that Pr{e}, using ML detection, is bounded by
$$\Pr\{e\} \le (M-1)\, Q\!\left(\sqrt{E/N_0}\right).$$


(b) Let $M \to \infty$ with $E_b = E/\log M$ held constant. Using the upper bound for Q(x) in
Exercise 8.7b, show that if $E_b/N_0 > 2\ln 2$ then $\lim_{M\to\infty} \Pr(e) = 0$. How close is this to
the ultimate Shannon limit on $E_b/N_0$? What is the limit of the normalized rate $\rho$?
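
The threshold behavior in part (b) can be seen numerically by combining the union bound of part (a) with the bound of Exercise 8.7b. The sketch below is an illustration only; it assumes the logarithm in $E_b = E/\log M$ is base 2, and the function name is hypothetical.

```python
import math


def union_bound(M, eb_over_n0):
    """Upper bound (M-1) * (1/2) * exp(-E / (2 N0)) on Pr(e).

    Obtained by applying Q(x) <= (1/2) exp(-x^2 / 2) to
    (M-1) * Q(sqrt(E / N0)), with E = Eb * log2(M) (assumed base 2).
    """
    e_over_n0 = eb_over_n0 * math.log2(M)
    return (M - 1) * 0.5 * math.exp(-e_over_n0 / 2.0)


if __name__ == "__main__":
    for eb_over_n0 in (1.0, 2.0):  # below and above 2 ln 2 ~ 1.386
        values = [union_bound(2 ** b, eb_over_n0) for b in (10, 20, 30)]
        print(eb_over_n0, values)
```

For $E_b/N_0$ above $2\ln 2$ the bound shrinks as M grows; below it the bound grows without limit and so says nothing about Pr(e).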


8.10. (Lower bound to Pr(e) for orthogonal signals) (a) Recall the exact expression for error
probability for orthogonal signals in WGN from (8.47),


$$\Pr(e) = \int_{-\infty}^{\infty} f_{W_0|A}(w_0 | a_0)\, \Pr\left( \bigcup_{m=1}^{M-1} (W_m \ge w_0 \mid A = a_0) \right) dw_0.$$


Explain why the events $W_m \ge w_0$ for $1 \le m \le M-1$ are iid conditional on $A = a_0$ and
$W_0 = w_0$.

(b) Demonstrate the following two relations for any $w_0$,


$$\Pr\left( \bigcup_{m=1}^{M-1} (W_m \ge w_0 \mid A = a_0) \right) = 1 - [1 - Q(w_0)]^{M-1}
\ge (M-1)Q(w_0) - \frac{[(M-1)Q(w_0)]^2}{2}$$


(c) Define $\gamma_1$ by $(M-1)Q(\gamma_1) = 1$. Demonstrate the following:


$$\Pr\left( \bigcup_{m=1}^{M-1} (W_m \ge w_0 \mid A = a_0) \right) \ge
\begin{cases} \dfrac{(M-1)Q(w_0)}{2} & \text{for } w_0 > \gamma_1 \\[1ex] \dfrac{1}{2} & \text{for } w_0 \le \gamma_1 \end{cases}$$


(d) Show that


$$\Pr(e) \ge \frac{1}{2}\, Q(\alpha - \gamma_1)$$


(e) Show that $\lim_{M\to\infty} \gamma_1/\gamma = 1$ where $\gamma = \sqrt{2\ln M}$. Use this to compare the lower bound
in part (d) to the upper bounds for cases 1 and 2 in Subsection 8.5.3. In particular show
that $\Pr(e) \ge 1/4$ for $\gamma_1 > \alpha$ (the case where capacity is exceeded).

(f) Derive a tighter lower bound on Pr(e) than part (d) for the case where $\gamma_1 \le \alpha$. Show
that the ratio of the log of your lower bound and the log of the upper bound in Subsection
8.5.3 approaches 1 as $M \to \infty$. Note: this is much messier than the bounds above.


8.11. Section 8.3.4 discusses detection for binary complex vectors in WGN by viewing complex n-
dimensional vectors as 2n-dimensional real vectors. Here you will treat the vectors directly
as n-dimensional complex vectors. Let $Z = (Z_1, \ldots, Z_n)^T$ be a vector of complex iid
Gaussian rv’s with iid real and imaginary parts, each $\mathcal{N}(0, N_0/2)$. The input U is binary
antipodal, taking on values a or −a. The observation V is U + Z.


(a) The probability density of Z is given by


$$f_Z(z) = \frac{1}{(\pi N_0)^n} \exp\left( \sum_{j=1}^{n} \frac{-|z_j|^2}{N_0} \right) = \frac{1}{(\pi N_0)^n} \exp\left( \frac{-\|z\|^2}{N_0} \right)$$


Explain what this probability density represents (i.e., probability per unit what?).
(b) Give expressions for $f_{V|U}(v|a)$ and $f_{V|U}(v|-a)$.


(c) Show that the log likelihood ratio for the observation v is given by


$$\mathrm{LLR}(v) = \frac{-\|v - a\|^2 + \|v + a\|^2}{N_0}$$


(d) Explain why this implies that ML detection is minimum distance detection (defining
the distance between two complex vectors as the norm of their difference).


(e) Show that LLR(v) can also be written as $\frac{4\Re(\langle v, a \rangle)}{N_0}$.


(f) The appearance of the real part, $\Re(\langle v, a \rangle)$, above is surprising. Point out why log
likelihood ratios must be real. Also explain why replacing $\Re(\langle v, a \rangle)$ by $|\langle v, a \rangle|$ in the
above expression would give a nonsensical result in the ML test.


(g) Does the set of points {v : LLR(v) = 0} form a complex vector space?

8.12. Let D be the function that maps vectors in $\mathbb{C}^n$ into vectors in $\mathbb{R}^{2n}$ by the mapping


$$a = (a_1, a_2, \ldots, a_n) \to (\Re a_1, \Re a_2, \ldots, \Re a_n, \Im a_1, \Im a_2, \ldots, \Im a_n) = D(a)$$


(a) Explain why $a \in \mathbb{C}^n$ and $ia$ ($i = \sqrt{-1}$) are contained in the one dimensional complex
subspace of $\mathbb{C}^n$ spanned by a.


(b) Show that D(a) and D(ia) are orthogonal vectors in $\mathbb{R}^{2n}$.

(c) For $v, a \in \mathbb{C}^n$, the projection of v on a is given by $v_{|a} = \frac{\langle v, a \rangle}{\|a\|} \frac{a}{\|a\|}$. Show that $D(v_{|a})$
is the projection of D(v) onto the subspace of $\mathbb{R}^{2n}$ spanned by D(a) and D(ia).

(d) Show that $D\left( \frac{\Re(\langle v, a \rangle)}{\|a\|} \frac{a}{\|a\|} \right)$ is the further projection of D(v) onto D(a).


8.13. Consider 4-QAM with the 4 signal points $u = \pm a \pm ia$. Assume Gaussian noise with spectral
density $N_0/2$ per dimension.


(a) Sketch the signal set and the ML decision regions for the received complex sample value
v. Find the exact probability of error (in terms of the Q function) for this signal set using
ML detection.

(b) Consider 4-QAM as two 2-PAM systems in parallel. That is, a ML decision is made
on $\Re(u)$ from $\Re(v)$ and a decision is made on $\Im(u)$ from $\Im(v)$. Find the error probability
(in terms of the Q function) for the ML decision on $\Re(u)$ and similarly for the decision on
$\Im(u)$.

(c) Explain the difference between what has been called an error in part (a) and what has
been called an error in part (b).


(d) Derive the QAM error probability directly from the PAM error probability.

8.14. Consider two 4-QAM systems with the same 4-QAM constellation


$$s_0 = 1 + i, \quad s_1 = -1 + i, \quad s_2 = -1 - i, \quad s_3 = 1 - i.$$
For each system, a pair of bits is mapped into a signal, but the two mappings are different:


Mapping 1: 00 → $s_0$, 01 → $s_1$, 10 → $s_2$, 11 → $s_3$
Mapping 2: 00 → $s_0$, 01 → $s_1$, 11 → $s_2$, 10 → $s_3$

The bits are independent and 0’s and 1’s are equiprobable, so the constellation points are 
equally likely in both systems. Suppose the signals are decoded by the minimum distance 
decoding rule, and the signal is then mapped back into the two binary digits. Find the 
error probability (in terms of the Q function) for each bit in each of the two systems. 


8.15. Re-state Theorem 8.4.1 for the case of MAP detection. Assume that the inputs $U_1, \ldots, U_n$
are independent and each have the a priori distribution $p_0, \ldots, p_{M-1}$. Hint: start with
(8.41) and (8.42), which are still valid here.


8.16. The following problem relates to a digital modulation scheme often referred to as minimum
shift keying (MSK). Let


$$s_0(t) = \begin{cases} \sqrt{\dfrac{2E}{T}} \cos(2\pi f_0 t) & \text{if } 0 \le t \le T, \\ 0 & \text{otherwise.} \end{cases}$$


$$s_1(t) = \begin{cases} \sqrt{\dfrac{2E}{T}} \cos(2\pi f_1 t) & \text{if } 0 \le t \le T, \\ 0 & \text{otherwise.} \end{cases}$$


(a) Compute the energy of the signals $s_0(t)$, $s_1(t)$. You may assume that $f_0 T \gg 1$ and
$f_1 T \gg 1$.

(b) Find conditions on the frequencies $f_0$, $f_1$ and the duration T to ensure both that the
signals $s_0(t)$ and $s_1(t)$ are orthogonal and that $s_0(0) = s_0(T) = s_1(0) = s_1(T)$. Why do
you think a system with these parameters is called minimum shift keying?

(c) Assume that the parameters are chosen as in (b). Suppose that, under U=0, the
signal $s_0(t)$ is transmitted, and under U=1, the signal $s_1(t)$ is transmitted. Assume that
the hypotheses are equally likely. Let the observed signal be equal to the sum of the
transmitted signal and a White Gaussian process with spectral density $N_0/2$. Find the
optimal detector to minimize the probability of error. Draw a block diagram of a possible
implementation.


(d) Compute the probability of error of the detector you have found in part (c).


8.17. Consider binary communication to a receiver containing $k_0$ antennas. The transmitted
signal is $\pm a$. Each antenna has its own demodulator, and the received signal after demod-
ulation at antenna k, $1 \le k \le k_0$, is given by


$$V_k = U g_k + Z_k,$$


where U is +a for U=0 and −a for U=1. Also $g_k$ is the gain of antenna k and $Z_k \sim \mathcal{N}(0, \sigma^2)$
is the noise at antenna k; everything is real and $U, Z_1, Z_2, \ldots, Z_{k_0}$ are independent. In
vector notation, $V = Ug + Z$ where $V = (V_1, \ldots, V_{k_0})^T$ etc.


(a) Suppose that the signal at each receiving antenna k is weighted by an arbitrary real
number $q_k$, and the signals are combined as $Y = \sum_k V_k q_k = \langle V, q \rangle$. What is the maximum
likelihood (ML) detector for U given the observation Y?

(b) What is the probability of error Pr(e) for this detector?


(c) Let $\beta = \frac{\langle g, q \rangle}{\|g\|\,\|q\|}$. Express Pr(e) in a form where q does not appear except for its effect
on $\beta$.


(d) Give an intuitive explanation why changing q to cq for some nonzero scalar c does not
change Pr(e).

(e) Minimize Pr(e) over all choices of q (or $\beta$) above.


(f) Is it possible to reduce Pr(e) further by doing ML detection on $V_1, \ldots, V_{k_0}$ rather than
restricting ourselves to a linear combination of those variables?


(g) Redo part (b) under the assumption that the noise variables have different variances,
i.e., $Z_k \sim \mathcal{N}(0, \sigma_k^2)$. As before, $U, Z_1, \ldots, Z_{k_0}$ are independent.

(h) Minimize Pr(e) in part (g) over all choices of q.


8.18. (a) The Hadamard matrix $H_1$ has the rows 00 and 01. Viewed as binary codewords this
is rather foolish since the first binary digit is always 0 and thus carries no information at
all. Map the symbols 0 and 1 into the signals a and −a respectively, a > 0, and plot these
two signals on a two dimensional plane. Explain the purpose of the first bit in terms of
generating orthogonal signals.

(b) Assume that the mod-2 sum of each pair of rows of $H_b$ is another row of $H_b$ for any
given integer $b \ge 1$. Use this to prove the same result for $H_{b+1}$. Hint: Look separately at
the mod-2 sum of two rows in the first half of the rows, two rows in the second half, and
two rows in different halves.


8.19. (RM codes) (a) Verify the following combinatorial identity for 0 < r < m:


$$\sum_{j=0}^{r} \binom{m}{j} = \sum_{j=0}^{r-1} \binom{m-1}{j} + \sum_{j=0}^{r} \binom{m-1}{j}$$


Hint: Note that the first term above is the number of binary m tuples with r or fewer 1’s.
Consider separately the number of these that end in 1 and end in 0.

(b) Use induction on m to show that $k(r, m) = \sum_{j=0}^{r} \binom{m}{j}$. Be careful how you handle r = 0
and r = m.
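
Before proving the identity in part (a), it can be spot-checked by direct computation. The Python sketch below is only an illustrative check; the function names are hypothetical, and `k(r, m)` simply evaluates the sum claimed in part (b).

```python
from math import comb


def k(r, m):
    """The sum in part (b): number of binary m-tuples with r or fewer 1's."""
    return sum(comb(m, j) for j in range(r + 1))


def identity_holds(r, m):
    """Check the identity of part (a) for one pair (r, m) with 0 < r < m."""
    ending_in_1 = sum(comb(m - 1, j) for j in range(r))      # j = 0..r-1
    ending_in_0 = sum(comb(m - 1, j) for j in range(r + 1))  # j = 0..r
    return k(r, m) == ending_in_1 + ending_in_0
```

The two partial sums mirror the hint: an m-tuple ending in 1 may have at most r−1 ones in its first m−1 positions, while one ending in 0 may have up to r.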

8.20. (RM codes) This exercise first shows that $RM(r, m) \subset RM(r+1, m)$ for $0 \le r < m$. It then
shows that $d_{\min}(r, m) = 2^{m-r}$.


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].




(a) Show that if $RM(r-1, m-1) \subset RM(r, m-1)$ for all r, 0 < r < m, then
$RM(r-1, m) \subset RM(r, m)$ for all r, $0 < r \le m$.


Note: Be careful about r = 1 and r = m.

(b) Let $x = (u, u \oplus v)$ where $u \in RM(r, m-1)$ and $v \in RM(r-1, m-1)$. Assume that
$d_{\min}(r, m-1) \ge 2^{m-1-r}$ and $d_{\min}(r-1, m-1) \ge 2^{m-r}$. Show that if x is nonzero, it has at
least $2^{m-r}$ 1’s. Hint 1: For a linear code, $d_{\min}$ is equal to the weight (number of ones) in
the minimum-weight nonzero codeword. Hint 2: First consider the case v = 0, then the
case u = 0. Finally use part (a) in considering the case $u \ne 0$, $v \ne 0$ under the subcases
u = v and $u \ne v$.

(c) Use induction on m to show that $d_{\min} = 2^{m-r}$ for $0 \le r \le m$.


Chapter 9


Wireless digital communication 


9.1 Introduction 


This chapter provides a brief treatment of wireless digital communication systems. More exten-
sive treatments are found in many texts, particularly [32] and [9]. As the name suggests, wireless
systems operate via transmission through space rather than through a wired connection. This 
has the advantage of allowing users to make and receive calls almost anywhere, including while 
in motion. Wireless communication is sometimes called mobile communication since many of 
the new technical issues arise from motion of the transmitter or receiver. 


There are two major new problems to be addressed in wireless that do not arise with wires. The 
first is that the communication channel often varies with time. The second is that there is often 
interference between multiple users. In previous chapters, modulation and coding techniques 
have been viewed as ways to combat the noise on communication channels. In wireless systems, 
these techniques must also combat time-variation and interference. This will cause major changes 
both in the modeling of the channel and the type of modulation and coding. 


Wireless communication, despite the hype of the popular press, is a field that has been around for 
over a hundred years, starting around 1897 with Marconi’s successful demonstrations of wireless 
telegraphy. By 1901, radio reception across the Atlantic Ocean had been established, illustrating 
that rapid progress in technology has also been around for quite a while. In the intervening 
hundred years, many types of wireless systems have flourished, and often later disappeared. For
example, television transmission, in its early days, was broadcast by wireless radio transmitters,
a method that is increasingly being replaced by cable or satellite transmission. Similarly, the
point-to-point microwave circuits that formerly constituted the backbone of the telephone network
are being replaced by optical fiber. In the first example, wireless technology became outdated 
when a wired distribution network was installed; in the second, a new wired technology (optical 
fiber) replaced the older wireless technology. The opposite type of example is occurring today 
in telephony, where cellular telephony is partially replacing wireline telephony, particularly in 
parts of the world where the wired network is not well developed. The point of these examples is 
that there are many situations in which there is a choice between wireless and wire technologies, 
and the choice often changes when new technologies become available. 


Cellular networks will be emphasized in this chapter, both because they are of great current
interest and also because they involve a relatively simple architecture within which most of the
physical layer communication aspects of wireless systems can be studied. A cellular network
consists of a large number of wireless subscribers with cellular telephones (cell phones) that can
be used in cars, buildings, streets, etc. There are also a number of fixed base stations arranged 
to provide wireless electromagnetic communication with arbitrarily located cell phones. 


The area covered by a base station, i.e., the area from which incoming calls can reach that base 
station, is called a cell. One often pictures a cell as a hexagonal region with the base station in 
the middle. One then pictures a city or region as being broken up into a hexagonal lattice of cells 
(see Figure 9.1a). In reality, the base stations are placed somewhat irregularly, depending on the 
location of places such as building tops or hill tops that have good communication coverage and 
that can be leased or bought (see Figure 9.1b). Similarly, the base station used by a particular 
cell phone is selected more on the basis of communication quality than of geographic distance. 


Figure 9.1: Cells and base stations for a cellular network. Part (a): an oversimplified view
in which each cell is hexagonal. Part (b): a more realistic case where base stations are
irregularly placed and cell phones choose the best base station.

Each cell phone, when it makes a call, is connected (via its antenna and electromagnetic radi- 
ation) to the base station with the best apparent communication path. The base stations in 
a given area are connected to a mobile telephone switching office (MTSO) by high speed wire, 
fiber, or microwave connections. The MTSO is connected to the public wired telephone network. 
Thus an incoming call from a cell phone is first connected to a base station and from there to the 
MTSO and then to the wired network. From there the call goes to its destination, which might 
be another cell phone, or an ordinary wire line telephone, or a computer connection. Thus, we 
see that a cellular network is not an independent network, but rather an appendage to the wired 
network. The MTSO also plays a major role in coordinating which base station will handle a 
call to or from a cell phone and when to hand-off a cell phone conversation from one base station 
to another. 


When another telephone (either wired or wireless) places a call to a given cell phone, the reverse
process takes place. First the cell phone is located and an MTSO and nearby base station are
selected. Then the call is set up through the MTSO and base station. The wireless link from
a base station to a cell phone is called the downlink (or forward) channel, and the link from a 
cell phone to a base station is called the uplink (or reverse) channel. There are usually many 
cell phones connected to a single base station. Thus, for downlink communication, the base 
station multiplexes the signals intended for the various connected cell phones and broadcasts 
the resulting single waveform from which each cell phone can extract its own signal. This set 
of downlink channels from a base station to multiple cell phones is called a broadcast channel. 
For the uplink channels, each cell phone connected to a given base station transmits its own
waveform, and the base station receives the sum of the waveforms from the various cell phones
plus noise. The base station must then separate and detect the signals from each cell phone
and pass the resulting binary streams to the MTSO. This set of uplink channels to a given base
station is called a multiaccess channel.


Early cellular systems were analog. They operated by directly modulating a voice waveform 
on a carrier and transmitting it. Different cell phones in the same cell were assigned different 
modulation frequencies, and adjacent cells used different sets of frequencies. Cells sufficiently far 
away from each other could reuse the same set of frequencies with little danger of interference. 


All of the newer cellular systems are digital (i.e., use a binary interface), and thus, in principle, 
can be used for voice or data. Since these cellular systems, and their standards, originally focused 
on telephony, the current data rates and delays in cellular systems are essentially determined by 
voice requirements. At present, these systems are still mostly used for telephony, but both the 
capability to send data and the applications for data are rapidly increasing. Also the capabilities 
to transmit data at higher rates than telephony rates are rapidly being added to cellular systems. 


As mentioned above, there are many kinds of wireless systems other than cellular. First there 
are the broadcast systems such as AM radio, FM radio, TV, and paging systems. All of these 
are similar to the broadcast part of cellular networks, although the data rates, the size of the 
areas covered by each broadcasting node, and the frequency ranges are very different. 


In addition, there are wireless LANs (local area networks). These are designed for much higher 
data rates than cellular systems, but otherwise are somewhat similar to a single cell of a cellular 
system. These are designed to connect PC’s, shared peripheral devices, large computers, etc. 
within an office building or similar local environment. There is little mobility expected in such 
systems and their major function is to avoid stringing a maze of cables through an office building. 
The principal standards for such networks are the 802.11 family of IEEE standards. There is 
a similar even smaller-scale standard called Bluetooth whose purpose is to reduce cabling and 
simplify transfers between office and hand held devices. 


Finally, there is another type of LAN called an ad hoc network. Here, instead of a central node 
(base station) through which all traffic flows, the nodes are all alike. These networks organize 
themselves into links between various pairs of nodes and develop routing tables using these links. 
The network layer issues of routing, protocols, and shared control are of primary concern for ad 
hoc networks; this is somewhat disjoint from our focus here on physical-layer communication 
issues. 


One of the most important questions for all of these wireless systems is that of standardiza-
tion. Some types of standardization are mandated by the Federal Communication Commission
(FCC) in the USA and corresponding agencies in other countries. This has limited the available
bandwidth for conventional cellular communication to three frequency bands, one around 0.9
GHz, another around 1.9 GHz, and the other around 5.8 GHz. Other kinds of standardization are
important since users want to use their cell phones over national and international areas. There
are three well established mutually incompatible major types of digital cellular systems. One is
the GSM system,¹ which was standardized in Europe and is now used worldwide, another is a
TDM (Time Division Modulation) standard developed in the U.S., and a third is CDMA (Code
Division Multiple Access). All of these are evolving and many newer systems with a dizzying
array of new features are constantly being introduced. Many cell phones can switch between
multiple modes as a partial solution to these incompatibility issues.


¹ GSM stands for Groupe Speciale Mobile or Global Systems for Mobile Communication, but the acronym is
far better known and just as meaningful as the words.


This chapter will focus primarily on CDMA, partly because so many newer systems are using
this approach, and partly because it provides an excellent medium for discussing communication 
principles. GSM and TDM will be discussed briefly, but the issues of standardization are so 
centered on non-technological issues and so rapidly changing that they will not be discussed 
further. 


In thinking about wireless LAN’s and cellular telephony, an obvious question is whether they 
will some day be combined into one network. The use of data rates compatible with voice rates 
already exists in the cellular network, and the possibility of much higher data rates already exists 
in wireless LANs, so the question is whether very high data rates are commercially desirable 
for standardized cellular networks. The wireless medium is a much more difficult medium for 
communication than the wired network. The spectrum available for cellular systems is quite 
limited, the interference level is quite high, and rapid growth is increasing the level of interference. 
Adding higher data rates will exacerbate this interference problem even more. In addition, the 
display on hand held devices is small, limiting the amount of data that can be presented and 
suggesting that many applications of such devices do not need very high data rates. Thus it is 
questionable whether very high-speed data for cellular networks is necessary or desirable in the 
near future. On the other hand, there is intense competition between cellular providers, and 
each strives to distinguish their service by new features requiring increased data rates. 


Subsequent sections begin the study of the technological aspects of wireless channels, focusing 
primarily on cellular systems. Section 9.2 looks briefly at the electromagnetic properties that 
propagate signals from transmitter to receiver. Section 9.3 then converts these detailed elec- 
tromagnetic models into simpler input/output descriptions of the channel. These input/output 
models can be characterized most simply as linear time-varying filter models. 


The input/output model above views the input, the channel properties, and the output at 
passband. Section 9.4 then finds the baseband equivalent for this passband view of the channel. 
It turns out that the channel can then be modeled as a complex baseband linear time-varying 
filter. Finally, in section 9.5, this deterministic baseband model is replaced by a stochastic 
model. 


The remainder of the chapter then introduces various issues of communication over such a 
stochastic baseband channel. Along with modulation and detection in the presence of noise, we 
also discuss channel measurement, coding, and diversity. The chapter ends with a brief case 
study of the CDMA cellular standard, IS95. 


9.2 Physical modeling for wireless channels 


Wireless channels operate via electromagnetic radiation from transmitter to receiver. In prin- 
ciple, one could solve Maxwell’s equations for the given transmitted signal to find the electro- 
magnetic field at the receiving antenna. This would have to account for the reflections from 
nearby buildings, vehicles, and bodies of land and water. Objects in the line of sight between 
transmitter and receiver would also have to be accounted for. 


The wavelength $\lambda(f)$ of electromagnetic radiation at any given frequency f is given by $\lambda = c/f$,
where $c = 3 \times 10^8$ meters per second is the velocity of light. The wavelength in the bands
allocated for cellular communication thus lies between 0.05 and 0.3 meters. To calculate the
electromagnetic field at a receiver, the locations of the receiver and the obstructions would have
to be known within sub-meter accuracies. The electromagnetic field equations therefore appear
to be unreasonable to solve, especially on the fly for moving users. Thus, electromagnetism
cannot be used to characterize wireless channels in detail, but it will provide understanding
about the underlying nature of these channels.
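

The wavelength range quoted above follows directly from $\lambda = c/f$ applied to the three cellular bands mentioned in Section 9.1 (around 0.9, 1.9, and 5.8 GHz). The short Python sketch below just carries out that arithmetic; the function and constant names are illustrative, and c is rounded to $3 \times 10^8$ m/s as in the text.

```python
C_LIGHT = 3.0e8  # velocity of light in meters per second (rounded, as in the text)


def wavelength_m(freq_hz):
    """Wavelength in meters for a carrier frequency in Hz, via lambda = c / f."""
    return C_LIGHT / freq_hz


if __name__ == "__main__":
    for freq_ghz in (0.9, 1.9, 5.8):
        print(f"{freq_ghz} GHz -> {wavelength_m(freq_ghz * 1e9):.3f} m")
```

Since the wavelengths are fractions of a meter, position errors of a few centimeters already change the received phase substantially, which is why sub-meter accuracy would be needed.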


One important question is where to place base stations, and what range of power levels are then 
necessary on the downlinks and uplinks. To a great extent, this question must be answered 
experimentally, but it certainly helps to have a sense of what types of phenomena to expect. 
Another major question is what types of modulation techniques and detection techniques look 
promising. Here again, a sense of what types of phenomena to expect is important, but the 
information will be used in a different way. Since cell phones must operate under a wide variety 
of different conditions, it will make sense to view these conditions probabilistically. Before 
developing such a stochastic model for channel behavior, however, we first explore the gross 
characteristics of wireless channels by looking at several highly idealized models. 


9.2.1 Free space, fixed transmitting and receiving antennas 


First consider a fixed antenna radiating into free space. In the far field,² the electric field and
magnetic field at any given location d are perpendicular both to each other and to the direction
of propagation from the antenna. They are also proportional to each other, so we focus on only
the electric field (just as we normally consider only the voltage or only the current for electronic
signals). The electric field at d is in general a vector with components in the two co-ordinate
directions perpendicular to the line of propagation. Often one of these two components is zero
so that the electric field at d can be viewed as a real-valued function of time. For simplicity, we
look only at this case. The electric waveform is usually a passband waveform modulated around
a carrier, and we focus on the complex positive frequency part of the waveform. The electric
far-field response at point d to a transmitted complex sinusoid, $\exp(2\pi i f t)$, can be expressed as

\[ E(f, t, d) = \frac{\alpha_s(\theta, \psi, f)\, \exp\{2\pi i f (t - r/c)\}}{r}. \tag{9.1} \]


Here $(r, \theta, \psi)$ represents the point d in space at which the electric field is being
measured; r is the distance from the transmitting antenna to d and $(\theta, \psi)$ represents the
vertical and horizontal angles from the antenna to d. The radiation pattern of the transmitting
antenna at frequency f in the direction $(\theta, \psi)$ is denoted by the complex function
$\alpha_s(\theta, \psi, f)$. The magnitude of $\alpha_s$ includes antenna losses; the phase of
$\alpha_s$ represents the phase change due to the antenna. The phase of the field also varies with
$fr/c$, corresponding to the delay $r/c$ caused by the radiation traveling at the speed of light c.


We are not concerned here with actually finding the radiation pattern for any given antenna, 
but only with recognizing that antennas have radiation patterns, and that the free space far 
field depends on that pattern as well as on the 1/r attenuation and r/c delay. 


The reason why the electric field goes down with 1/r in free space can be seen by looking at
concentric spheres of increasing radius r around the antenna. Since free space is lossless, the
total power radiated through the surface of each sphere remains constant. Since the surface area
is increasing with $r^2$, the power radiated per unit area must go down as $1/r^2$, and thus E must
go down as 1/r. This does not imply that power is radiated uniformly in all directions; the
radiation pattern is determined by the transmitting antenna. As seen later, this $r^{-2}$ reduction
of power with distance is sometimes invalid when there are obstructions to free space propagation.

² The far field is the field many wavelengths away from the antenna, and (9.1) is the limiting form as this
number of wavelengths increases. It is a safe assumption that cellular receivers are in the far field.


Next, suppose there is a fixed receiving antenna at location $d = (r, \theta, \psi)$. The received
waveform at the antenna terminals (in the absence of noise) in response to $\exp(2\pi i f t)$ is then

\[ \frac{\alpha(\theta, \psi, f)\, \exp\{2\pi i f (t - r/c)\}}{r} \tag{9.2} \]

where $\alpha(\theta, \psi, f)$ is the product of $\alpha_s$ (the antenna pattern of the transmitting
antenna) and the antenna pattern of the receiving antenna; thus the losses and phase changes of
both antennas are accounted for in $\alpha(\theta, \psi, f)$. The explanation for this response is
that the receiving antenna causes only local changes in the electric field, and thus alters neither
the $r/c$ delay nor the $1/r$ attenuation.


For the given input and output, a system function $\hat{h}(f)$ can be defined as

\[ \hat{h}(f) = \frac{\alpha(\theta, \psi, f)\, \exp\{-2\pi i f r/c\}}{r}. \tag{9.3} \]

Substituting this in (9.2), the response to $\exp(2\pi i f t)$ is $\hat{h}(f) \exp\{2\pi i f t\}$.


Electromagnetic radiation has the property that the response is linear in the input. Thus
the response at the receiver to a superposition of transmitted sinusoids is simply the
superposition of responses to the individual sinusoids. The response to an arbitrary input
$x(t) = \int_{-\infty}^{\infty} \hat{x}(f) \exp\{2\pi i f t\}\, df$ is then

\[ y(t) = \int_{-\infty}^{\infty} \hat{x}(f)\, \hat{h}(f)\, \exp\{2\pi i f t\}\, df. \tag{9.4} \]

We see from (9.4) that the Fourier transform of the output y(t) is $\hat{y}(f) = \hat{x}(f)\hat{h}(f)$.
From the convolution theorem, this means that

\[ y(t) = \int_{-\infty}^{\infty} x(\tau)\, h(t - \tau)\, d\tau, \tag{9.5} \]

where $h(t) = \int_{-\infty}^{\infty} \hat{h}(f) \exp\{2\pi i f t\}\, df$ is the inverse Fourier transform
of $\hat{h}(f)$. Since the physical input and output must be real, $\hat{x}(f) = \hat{x}^*(-f)$ and
$\hat{y}(f) = \hat{y}^*(-f)$. It is then necessary that $\hat{h}(f) = \hat{h}^*(-f)$ also.

The channel in this free space example is thus a conventional linear time-invariant (LTI) system
with impulse response h(t) and system function $\hat{h}(f)$.


For the special case where the combined antenna pattern $\alpha(\theta, \psi, f)$ is real and
independent of frequency (at least over the frequency range of interest), we see that $\hat{h}(f)$ is
a complex exponential³ in f and thus h(t) is $\frac{\alpha}{r}\, \delta(t - \frac{r}{c})$ where $\delta$
is the Dirac delta function. From (9.5), y(t) is then given by

\[ y(t) = \frac{\alpha}{r}\, x\!\left(t - \frac{r}{c}\right). \]


If $\hat{h}(f)$ is other than a complex exponential, then h(t) is not an impulse, and y(t) becomes a
non-trivial filtered version of x(t) rather than simply an attenuated and delayed version. From
(9.4), however, y(t) only depends on $\hat{h}(f)$ over the frequency band where $\hat{x}(f)$ is
non-zero. Thus it is common to model $\hat{h}(f)$ as a complex exponential (and thus h(t) as a scaled
and shifted Dirac delta function) whenever $\hat{h}(f)$ is a complex exponential over the frequency
band of use.

³ More generally, $\hat{h}(f)$ is a complex exponential if $|\alpha|$ is independent of f and $\angle\alpha$ is linear in f.


We will find in what follows that linearity is a good assumption for all the wireless channels to 
be considered, but that time invariance does not hold when either the antennas or reflecting 
objects are in relative motion. 


9.2.2 Free space, moving antenna 


Continue to assume a fixed antenna transmitting into free space, but now assume that the
receiving antenna is moving with constant velocity v in the direction of increasing distance from
the transmitting antenna. That is, assume that the receiving antenna is at a moving location
described as $d(t) = (r(t), \theta, \psi)$ with $r(t) = r_0 + vt$. In the absence of the receiving
antenna, the electric field at the moving point d(t), in response to an input $\exp(2\pi i f t)$, is
given by (9.1) as

\[ E(f, t, d(t)) = \frac{\alpha_s(\theta, \psi, f)\, \exp\{2\pi i f (t - r_0/c - vt/c)\}}{r_0 + vt}. \tag{9.6} \]

We can rewrite $f(t - r_0/c - vt/c)$ as $f(1 - v/c)t - f r_0/c$. Thus the sinusoid at frequency f has
been converted to a sinusoid of frequency $f(1 - v/c)$; there has been a Doppler shift of $-fv/c$
due to the motion of d(t).⁴ Physically, each successive crest in the transmitted sinusoid has to
travel a little further before it gets observed at this moving observation point.


Placing the receiving antenna at d(t), the waveform at the terminals of the receiving antenna,
in response to $\exp(2\pi i f t)$, is given by

\[ \frac{\alpha(\theta, \psi, f)\, \exp\{2\pi i [f(1 - \frac{v}{c})t - \frac{f r_0}{c}]\}}{r_0 + vt}, \tag{9.7} \]

where $\alpha(\theta, \psi, f)$ is the product of the transmitting and receiving antenna patterns.
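The size of the Doppler shift in (9.6) and (9.7) is easy to evaluate. The following sketch assumes a receiver moving directly away from the transmitter; the function names and the sample values (2 GHz, 30 m/s) are hypothetical and chosen only for illustration.

```python
C = 3.0e8  # velocity of light (m/s)


def doppler_shift(f_hz: float, v_mps: float) -> float:
    """Doppler shift -f v / c for a receiver receding at speed v."""
    return -f_hz * v_mps / C


def received_frequency(f_hz: float, v_mps: float) -> float:
    """Frequency f (1 - v/c) observed at the moving point."""
    return f_hz * (1.0 - v_mps / C)


# Assumed values: a 2 GHz sinusoid and a receiver receding at 30 m/s.
print(doppler_shift(2.0e9, 30.0))
print(received_frequency(2.0e9, 30.0))
```

A negative velocity in this sketch corresponds to motion toward the transmitter and gives a positive shift.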


This channel cannot be represented as an LTI channel since the response to a sinusoid is not a 
sinusoid of the same frequency. The channel is still linear, however, so it is characterized as a 
linear time-varying channel. Linear time-varying channels will be studied in the next section, 
but first, several simple models will be analyzed where the received electromagnetic wave also 
includes reflections from other objects. 


9.2.3. Moving antenna, reflecting wall 


Consider Figure 9.2 below in which there is a fixed antenna transmitting the sinusoid
$\exp(2\pi i f t)$. There is a large perfectly-reflecting wall at distance $r_0$ from the transmitting
antenna. A vehicle starts at the wall at time t = 0 and travels toward the sending antenna at
velocity v. There is a receiving antenna on the vehicle whose distance from the sending antenna
at time t > 0 is then given by $r_0 - vt$.

Figure 9.2: Illustration of a direct path and a reflected path

In the absence of the vehicle and receiving antenna, the electric field at $r_0 - vt$ is the sum of
the free space waveform and the waveform reflected from the wall. Assuming that the wall is
very large, the reflected wave at $r_0 - vt$ is the same (except for a sign change) as the free space
wave that would exist on the opposite side of the wall in the absence of the wall (see Figure
9.3). This means that the reflected wave at distance $r_0 - vt$ from the sending antenna has the
intensity and delay of a free-space wave at distance $r_0 + vt$. The combined electric field at d(t)
in response to the input $\exp(2\pi i f t)$ is then

\[ E(f, t, d(t)) = \frac{\alpha_s(\theta, \psi, f)\, \exp\{2\pi i f [t - \frac{r_0 - vt}{c}]\}}{r_0 - vt} - \frac{\alpha_s(\theta, \psi, f)\, \exp\{2\pi i f [t - \frac{r_0 + vt}{c}]\}}{r_0 + vt}. \tag{9.8} \]

⁴ Doppler shifts of electromagnetic waves follow the same principles as Doppler shifts of sound waves. For
example, when an airplane flies overhead, the noise from it appears to drop in frequency as it passes by.




Figure 9.3: Relation of reflected wave to the direct wave in the absence of a wall. 


Including the vehicle and its antenna, the signal at the antenna terminals, say $y_f(t)$, is again
the electric field at the antenna as modified by the receiving antenna pattern. Assume for
simplicity that this pattern is identical in the directions of the direct and the reflected wave.
Letting $\alpha$ denote the combined antenna pattern of transmitting and receiving antenna, the
received signal is then

\[ y_f(t) = \frac{\alpha\, \exp\{2\pi i f [t - \frac{r_0 - vt}{c}]\}}{r_0 - vt} - \frac{\alpha\, \exp\{2\pi i f [t - \frac{r_0 + vt}{c}]\}}{r_0 + vt}. \tag{9.9} \]


In essence, this approximates the solution of Maxwell’s equations by an approximate method 
called ray tracing. The approximation comes from assuming that the wall is infinitely large and 
that both fields are ideal far fields. 


The first term in (9.9), the direct wave, is a sinusoid of frequency $f(1 + v/c)$; its magnitude
is slowly increasing in t as $1/(r_0 - vt)$. The second is a sinusoid of frequency $f(1 - v/c)$; its
magnitude is slowly decreasing as $1/(r_0 + vt)$. The combination of the two frequencies creates
a beat frequency at $fv/c$. To see this analytically, assume initially that t is very small so the
denominator of each term above can be approximated as $r_0$. Then, factoring out the common
terms in the above exponentials, $y_f(t)$ is given by

\[ \begin{aligned}
y_f(t) &\approx \frac{\alpha\, \exp\{2\pi i f [t - \frac{r_0}{c}]\}\, \left(\exp\{2\pi i f v t/c\} - \exp\{-2\pi i f v t/c\}\right)}{r_0} \\
&= \frac{2i\, \alpha\, \exp\{2\pi i f [t - \frac{r_0}{c}]\}\, \sin\{2\pi f v t/c\}}{r_0}.
\end{aligned} \tag{9.10} \]


This is the product of two sinusoids, one at the input frequency f, which is typically on the
order of GHz, and the other at the Doppler shift $fv/c$, which is typically 500 Hz or less.


As an example, if the antenna is moving at v = 60 km/hr and if f = 900 MHz, this beat frequency
is $fv/c$ = 50 Hz. The sinusoid at f has about $1.8 \times 10^7$ cycles for each cycle of the beat
frequency. Thus $y_f(t)$ looks like a sinusoid at frequency f whose amplitude is sinusoidally varying
with a period of 20 ms. The amplitude goes from its maximum positive value to 0 in about 5 ms.
Viewed another way, the response alternates between being unfaded for about 5 ms and then
faded for about 5 ms. This is called multipath fading. Note that in (9.9) the response is viewed
as the sum of two sinusoids, each of different frequency, while in (9.10), the response is viewed
as a single sinusoid of the original frequency with a time-varying amplitude. These are just two
different ways to view essentially the same waveform.
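The numbers in this example, and the quality of the small-t approximation, can be checked numerically. The sketch below evaluates both the two-path sum of (9.9) and the product form of (9.10). The wall distance of 1000 m and the unit antenna pattern are assumptions made only for this illustration; the carrier frequency and speed are those of the example.

```python
import cmath
import math

C = 3.0e8        # velocity of light (m/s)
f = 900e6        # carrier frequency (Hz), as in the example
v = 60 / 3.6     # 60 km/hr expressed in m/s
r0 = 1000.0      # distance to the wall (m): an assumed value
alpha = 1.0      # combined antenna pattern: an assumed value


def y_exact(t: float) -> complex:
    """Two-path response (9.9), keeping the exact denominators."""
    direct = alpha * cmath.exp(2j * math.pi * f * (t - (r0 - v * t) / C)) / (r0 - v * t)
    reflected = alpha * cmath.exp(2j * math.pi * f * (t - (r0 + v * t) / C)) / (r0 + v * t)
    return direct - reflected


def y_approx(t: float) -> complex:
    """Small-t form (9.10): a carrier times a sinusoid at the Doppler shift."""
    carrier = alpha * cmath.exp(2j * math.pi * f * (t - r0 / C)) / r0
    return 2j * carrier * math.sin(2 * math.pi * f * v * t / C)


beat_hz = f * v / C
print("beat frequency (Hz):", beat_hz)
print("carrier cycles per beat cycle:", f / beat_hz)
print("beat period (s):", 1 / beat_hz)
for t in (0.0, 0.005, 0.010):
    print(t, abs(y_exact(t)), abs(y_approx(t)))
```

The printed magnitudes trace the envelope: zero at t = 0, close to its maximum 2α/r0 at 5 ms, and nearly zero again at 10 ms, with the exact and approximate forms differing only slightly.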


It can be seen why the denominator term in (9.9) was approximated in (9.10). When the
difference between two paths changes by a quarter wavelength, the phase difference between the
responses on the two paths changes by $\pi/2$, which causes a very significant change in the overall
received amplitude. Since the carrier wavelength is very small relative to the path lengths, the 
time over which this phase change is significant is far smaller than the time over which the 
denominator changes significantly. The phase changes are significant over millisecond intervals, 
whereas the denominator changes are significant over intervals of seconds or minutes. For mod- 
ulation and detection, the relevant time scales are milliseconds or less, and the denominators 
are effectively constant over these intervals. 


The reader might notice that many more approximations are required in even very simple wireless 
models than with wired communication. This is partly because the standard linear time invariant 
assumptions of wired communication usually provide straight-forward models, such as the system 
function in (9.3). Wireless systems are usually time-varying, and appropriate models depend very 
much on the time scales of interest. For wireless systems, making the appropriate approximations 
is often more important than subsequent manipulation of equations. 


9.2.4 Reflection from a ground plane 


Consider a transmitting and receiving antenna, both above a plane surface such as a road (see
Figure 9.4). If the angle of incidence between antenna and road is sufficiently small, then a
dielectric reflects most of the incident wave, with a sign change. When the horizontal distance
r between the antennas becomes very large relative to their vertical displacements from the
ground plane, a very surprising thing happens. In particular, the difference between the direct
path length and the reflected path length goes to zero as $r^{-1}$ with increasing r.

Figure 9.4: Illustration of a direct path and a reflected path off of a ground plane

When r is large enough, this difference between the path lengths becomes small relative to the
wavelength c/f of a sinusoid at frequency f. Since the sign of the electric field is reversed on
the reflected path, these two waves start to cancel each other out. The combined electric field
at the receiver is then attenuated as $r^{-2}$, and the received power goes down as $r^{-4}$. This is
worked out analytically in Exercise 9.3. What this example shows is that the received power
can decrease with distance considerably faster than $r^{-2}$ in the presence of reflections. This
particular geometry leads to an attenuation of $r^{-4}$ rather than multipath fading.
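The two claims above (path difference shrinking as 1/r, field magnitude shrinking as 1/r²) can be explored numerically. The sketch below is not the analysis of Exercise 9.3; it assumes a perfectly flat plane with a pure sign reversal on reflection, so that the reflected path length is that from an image of the sender below the plane. The antenna heights, frequency, and distances are hypothetical values.

```python
import cmath
import math

C = 3.0e8  # velocity of light (m/s)


def path_lengths(r: float, h_s: float, h_r: float):
    """Direct and reflected path lengths for antenna heights h_s and h_r."""
    direct = math.hypot(r, h_s - h_r)
    reflected = math.hypot(r, h_s + h_r)  # image of the sender below the plane
    return direct, reflected


def field_magnitude(r: float, h_s: float, h_r: float, f: float) -> float:
    """Magnitude of direct wave minus reflected wave, each falling as 1/distance."""
    d1, d2 = path_lengths(r, h_s, h_r)
    k = 2 * math.pi * f / C
    return abs(cmath.exp(-1j * k * d1) / d1 - cmath.exp(-1j * k * d2) / d2)


# Assumed values: 30 m and 1.5 m antenna heights, 900 MHz.
h_s, h_r, f = 30.0, 1.5, 900e6
for r in (1.0e4, 2.0e4, 4.0e4):
    d1, d2 = path_lengths(r, h_s, h_r)
    print(r, (d2 - d1) * r, field_magnitude(r, h_s, h_r, f))
```

In this geometry the product of the path difference and r settles near 2 h_s h_r, and each doubling of r reduces the field magnitude by roughly a factor of 4, that is, the power by roughly a factor of 16.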


The above example is only intended to show how attenuation can vary other than with $r^{-2}$ in the
presence of reflections. Real road surfaces are not perfectly flat and behave in more complicated 
ways. In other examples, power attenuation can vary with $r^{-6}$ or even decrease exponentially
with r. Also these attenuation effects cannot always be cleanly separated from multipath effects. 


A rapid decrease in power with increasing distance is helpful in one way and harmful in another. 
It is helpful in reducing the interference between adjoining cells, but is harmful in reducing 
the coverage of cells. As cellular systems become increasingly heavily used, however, the major 
determinant of cell size is the number of cell phones in the cell. The size of cells has been steadily 
decreasing in heavily used areas and one talks of micro cells and pico cells as a response to this 
effect. 


9.2.5 Shadowing 


Shadowing is a wireless phenomenon similar to the blocking of sunlight by clouds. It occurs 
when partially absorbing materials, such as the walls of buildings, lie between the sending and 
receiving antennas. It occurs both when cell phones are inside buildings and when outside cell 
phones are shielded from the base station by buildings or other structures. 


The effect of shadow fading differs from multipath fading in two important ways. First, shadow 
fades have durations on the order of multiple seconds or minutes. For this reason, shadow fading 
is often called slow fading and multipath fading is called fast fading. Second, the attenuation 
due to shadowing is exponential in the width of the barrier that must be passed through. Thus 
the overall power attenuation contains not only the $r^{-2}$ effect of free space transmission, but
also the exponential attenuation over the depth of the obstructing material. 


9.2.6 Moving antenna, multiple reflectors 


Each example with two paths above has used ray tracing to calculate the individual response
from each path and then added those responses to find the overall response to a sinusoidal input.
An arbitrary number of reflectors may be treated the same way. Finding the amplitude and
phase for each path is in general not a simple task. Even for the very simple large wall assumed
in Figure 9.2, the reflected field calculated in (9.9) is valid only at small distances from the wall
relative to the dimensions of the wall. At larger distances, the total power reflected from the wall
is proportional both to $r_0^{-2}$ and to the cross section of the wall. The portion of this power
reaching the receiver is proportional to $(r_0 - r(t))^{-2}$. Thus the power attenuation from
transmitter to receiver (for the reflected wave at large distances) is proportional to
$[r_0(r_0 - r(t))]^{-2}$ rather than to $[2r_0 - r(t)]^{-2}$. This shows that ray tracing must be used
with some caution. Fortunately, however, linearity still holds in these more complex cases.


Another type of reflection is known as scattering and can occur in the atmosphere or in reflections 
from very rough objects. Here the very large set of paths is better modeled as an integral over 
infinitesimally weak paths rather than as a finite sum. 


Finding the amplitude of the reflected field from each type of reflector is important in determining 
the coverage, and thus the placement, of base stations, although ultimately experimentation is 
necessary. Studying this in more depth, however, would take us too far into electromagnetic 
theory and too far away from questions of modulation, detection, and multiple access. Thus we 
now turn our attention to understanding the nature of the aggregate received waveform, given 
a representation for each reflected wave. This means modeling the input/output behavior of a 
channel rather than the detailed response on each path. 


9.3 Input/output models of wireless channels 


This section shows how to view a channel consisting of an arbitrary collection of J electromag- 
netic paths as a more abstract input/output model. For the reflecting wall example, there is a 
direct path and one reflecting path, so J = 2. In other examples, there might be a direct path 
along with multiple reflected paths, each coming from a separate reflecting object. In many 
cases, the direct path is blocked and only indirect paths exist. 

In many physical situations, the important paths are accompanied by other insignificant and 
highly attenuated paths. In these cases, the insignificant paths are omitted from the model and 
J denotes the number of remaining significant paths. 

As in the examples of the previous section, the J significant paths are associated with atten- 
uations and delays due to path lengths, antenna patterns, and reflector characteristics. As 
illustrated in Figure 9.5, the signal at the receiving antenna coming from path j in response to
an input $\exp(2\pi i f t)$ is given by

\[ \frac{\alpha_j\, \exp\{2\pi i f [t - \frac{r_j(t)}{c}]\}}{r_j(t)}. \]

The overall response at the receiving antenna to an input $\exp(2\pi i f t)$ is then

\[ y_f(t) = \sum_{j=1}^{J} \frac{\alpha_j\, \exp\{2\pi i f [t - \frac{r_j(t)}{c}]\}}{r_j(t)}. \tag{9.11} \]

For the example of a perfectly reflecting wall, the combined antenna gain $\alpha_1$ on the direct
path is denoted as $\alpha$ in (9.9). The combined antenna gain $\alpha_2$ for the reflected path is
$-\alpha$ because of the phase reversal at the reflector. The path lengths are $r_1(t) = r_0 - vt$
and $r_2(t) = r_0 + vt$, making (9.11) equivalent to (9.9) for this example.


For the general case of J significant paths, it is more convenient and general to replace (9.11)
with an expression explicitly denoting the complex attenuation $\beta_j(t)$ and delay $\tau_j(t)$ on
each path.

\[ y_f(t) = \sum_{j=1}^{J} \beta_j(t)\, \exp\{2\pi i f [t - \tau_j(t)]\}, \tag{9.12} \]

\[ \beta_j(t) = \frac{\alpha_j(t)}{r_j(t)}, \qquad \tau_j(t) = \frac{r_j(t)}{c}. \tag{9.13} \]

Figure 9.5: The reflected path above is represented by a vector c(t) from sending antenna
to reflector and a vector d(t) from reflector to receiving antenna. The path length $r_j(t)$ is
the sum of the lengths |c(t)| and |d(t)|. The complex function $\alpha_j(t)$ is the product of the
transmitting antenna pattern in the direction toward the reflector, the loss and phase change
at the reflector, and the receiver pattern in the direction from the reflector.


Eq. (9.12) can also be used for arbitrary attenuation rates rather than just the $1/r^2$ power loss
assumed in (9.11). By factoring out the term $\exp\{2\pi i f t\}$, (9.12) can be rewritten as

\[ y_f(t) = \hat{h}(f, t)\, \exp\{2\pi i f t\} \quad \text{where} \quad \hat{h}(f, t) = \sum_{j=1}^{J} \beta_j(t)\, \exp\{-2\pi i f \tau_j(t)\}. \tag{9.14} \]

The function $\hat{h}(f, t)$ is similar to the system function $\hat{h}(f)$ of a linear time-invariant
(LTI) system except for the variation in t. Thus $\hat{h}(f, t)$ is called the system function for the
linear time-varying (LTV) system (i.e., channel) above.


The path attenuations $\beta_j(t)$ vary slowly with time and frequency, but these variations are
negligibly slow over the time and frequency intervals of concern here. Thus a simplified model is
often used in which each attenuation is simply a constant $\beta_j$. In this simplified model, it is
also assumed that each path delay is changing at a constant rate, $\tau_j(t) = \tau_j^o + \tau_j' t$.
Thus $\hat{h}(f, t)$ in the simplified model is

\[ \hat{h}(f, t) = \sum_{j=1}^{J} \beta_j\, \exp\{-2\pi i f \tau_j(t)\} \quad \text{where} \quad \tau_j(t) = \tau_j^o + \tau_j' t. \tag{9.15} \]

This simplified model was used in analyzing the reflecting wall. There, $\beta_1 = -\beta_2 = \alpha/r_0$,
$\tau_1^o = \tau_2^o = r_0/c$, and $\tau_1' = -\tau_2' = -v/c$.
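The simplified model of (9.15) translates directly into a short computation. The sketch below is one possible representation, not a prescribed one: the `Path` record, the function names, and the numerical values for the reflecting wall (unit antenna pattern, wall at 1000 m, 60 km/hr, 900 MHz) are assumptions introduced for illustration.

```python
import cmath
import math
from dataclasses import dataclass

C = 3.0e8  # velocity of light (m/s)


@dataclass(frozen=True)
class Path:
    beta: complex    # constant complex attenuation beta_j
    tau0: float      # delay at t = 0, tau_j^o (seconds)
    tau_rate: float  # rate of change of delay, tau_j' (dimensionless)


def system_function(paths, f: float, t: float) -> complex:
    """h_hat(f, t) = sum_j beta_j exp{-2 pi i f (tau0_j + tau_rate_j t)}."""
    return sum(
        p.beta * cmath.exp(-2j * math.pi * f * (p.tau0 + p.tau_rate * t))
        for p in paths
    )


def reflecting_wall_paths(alpha: float, r0: float, v: float):
    """Direct and reflected paths of the reflecting wall example."""
    return [
        Path(alpha / r0, r0 / C, -v / C),   # direct path, length r0 - vt
        Path(-alpha / r0, r0 / C, v / C),   # reflected path, length r0 + vt
    ]


paths = reflecting_wall_paths(alpha=1.0, r0=1000.0, v=60 / 3.6)
for t in (0.0, 0.005, 0.010):
    print(t, abs(system_function(paths, 900e6, t)))
```

With these values the magnitude of the system function follows the same sinusoidal envelope as (9.10), which is what the equivalence of (9.11) and (9.9) would lead one to expect.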


9.3.1 The system function and impulse response for LTV systems 


The LTV system function $\hat{h}(f, t)$ in (9.14) was defined for a multipath channel with a finite
number of paths. A simplified model was defined in (9.15). The system function could also be
generalized in a straight-forward way to a channel with a continuum of paths. More generally
yet, if $y_f(t)$ is the response to the input $\exp\{2\pi i f t\}$, then $\hat{h}(f, t)$ is defined as
$y_f(t) \exp\{-2\pi i f t\}$.


In this subsection, $\hat{h}(f, t) \exp\{2\pi i f t\}$ is taken to be the response to
$\exp\{2\pi i f t\}$ for each frequency f. The objective is then to find the response to an arbitrary
input x(t). This will involve generalizing the well-known impulse response and convolution
equation of LTI systems to the LTV case.


The key assumption in this generalization is the linearity of the system. That is, if $y_1(t)$ and
$y_2(t)$ are the responses to $x_1(t)$ and $x_2(t)$ respectively, then
$\alpha_1 y_1(t) + \alpha_2 y_2(t)$ is the response to $\alpha_1 x_1(t) + \alpha_2 x_2(t)$. This
linearity follows from Maxwell's equations.⁵


Using linearity, the response to a superposition of complex sinusoids, say
$x(t) = \int_{-\infty}^{\infty} \hat{x}(f) \exp\{2\pi i f t\}\, df$, is

\[ y(t) = \int_{-\infty}^{\infty} \hat{x}(f)\, \hat{h}(f, t)\, \exp(2\pi i f t)\, df. \tag{9.16} \]

There is a temptation here to blindly imitate the theory of LTI systems and to confuse the Fourier
transform of y(t), namely $\hat{y}(f)$, with $\hat{x}(f)\hat{h}(f, t)$. This is wrong both logically and
physically. It is wrong logically because $\hat{x}(f)\hat{h}(f, t)$ is a function of t and f, whereas
$\hat{y}(f)$ is a function only of f. It is wrong physically because Doppler shifts cause the response
to $\hat{x}(f) \exp(2\pi i f t)$ to contain multiple sinusoids around f rather than a single sinusoid
at f. From the receiver's viewpoint, $\hat{y}(f)$ at a given f depends on $\hat{x}(\tilde{f})$ over a
range of $\tilde{f}$ around f.


Fortunately, (9.16) can still be used to derive a very satisfactory form of impulse response and
convolution equation. Define the time-varying impulse response $h(\tau, t)$ as the inverse Fourier
transform (in the time variable $\tau$) of $\hat{h}(f, t)$, where t is viewed as a parameter. That is,
for each $t \in \mathbb{R}$,

\[ h(\tau, t) = \int_{-\infty}^{\infty} \hat{h}(f, t)\, \exp(2\pi i f \tau)\, df, \qquad \hat{h}(f, t) = \int_{-\infty}^{\infty} h(\tau, t)\, \exp(-2\pi i f \tau)\, d\tau. \tag{9.17} \]

Intuitively, $\hat{h}(f, t)$ is regarded as a conventional LTI system function that is slowly changing
with t and $h(\tau, t)$ is regarded as a channel impulse response (in $\tau$) that is slowly changing
with t. Substituting the second part of (9.17) into (9.16),

\[ y(t) = \int_{-\infty}^{\infty} \hat{x}(f) \left[ \int_{-\infty}^{\infty} h(\tau, t)\, \exp[2\pi i f (t - \tau)]\, d\tau \right] df. \]

Interchanging the order of integration,⁶

\[ y(t) = \int_{-\infty}^{\infty} h(\tau, t) \left[ \int_{-\infty}^{\infty} \hat{x}(f)\, \exp[2\pi i f (t - \tau)]\, df \right] d\tau. \]

Identifying the inner integral as $x(t - \tau)$, we get the convolution equation for LTV filters,

\[ y(t) = \int_{-\infty}^{\infty} x(t - \tau)\, h(\tau, t)\, d\tau. \tag{9.18} \]

This expression is really quite nice. It says that the effects of mobile transmitters and receivers,
arbitrarily moving reflectors and absorbers, and all of the complexities of solving Maxwell's
equations, finally reduce to an input/output relation between transmit and receive antennas
which is simply represented as the impulse response of an LTV channel filter. That is, $h(\tau, t)$
is the response at time t to an impulse at time $t - \tau$. If $h(\tau, t)$ is a constant function of t,
then $h(\tau, t)$, as a function of $\tau$, is the conventional LTI impulse response.

⁵ Nonlinear effects can occur in high-power transmitting antennas, but we ignore that here.

⁶ Questions about convergence and interchange of limits will be ignored in this section. This is reasonable since
the inputs and outputs of interest should be essentially time and frequency limited to the range of validity of the
simplified multipath model.


This derivation applies for both real and complex inputs. The actual physical input x(t) at
bandpass must be real, however, and for every real x(t), the corresponding output y(t) must
also be real. This means that the LTV impulse response $h(\tau, t)$ must also be real. It then
follows from (9.17) that $\hat{h}(-f, t) = \hat{h}^*(f, t)$, which defines $\hat{h}(-f, t)$ in terms of
$\hat{h}(f, t)$ for all f > 0.


There are many similarities between the results above for LTV filters and the conventional results
for LTI filters. In both cases, the output waveform is the convolution of the input waveform
with the impulse response; in the LTI case, $y(t) = \int x(t - \tau) h(\tau)\, d\tau$, whereas in the
LTV case, $y(t) = \int x(t - \tau) h(\tau, t)\, d\tau$. In both cases, the system function is the Fourier
transform of the impulse response; for LTI filters, $h(\tau) \leftrightarrow \hat{h}(f)$ and for LTV
filters $h(\tau, t) \leftrightarrow \hat{h}(f, t)$, i.e., for each t the function $\hat{h}(f, t)$ (as a
function of f) is the Fourier transform of $h(\tau, t)$ (as a function of $\tau$). The most significant
difference is that $\hat{y}(f) = \hat{h}(f)\hat{x}(f)$ in the LTI case, whereas in the LTV case, the
corresponding statement says only that y(t) is the inverse Fourier transform of
$\hat{h}(f, t)\hat{x}(f)$.

It is important to realize that the Fourier relationship between the time-varying impulse
response $h(\tau, t)$ and the time-varying system function $\hat{h}(f, t)$ is valid for any LTV system
and does not depend on the simplified multipath model of (9.15). This simplified multipath model
is valuable, however, in acquiring insight into how multipath and time-varying attenuation affect
the transmitted waveform.


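For the simplified model of (9.15), $h(\tau, t)$ can be easily derived from $\hat{h}(f, t)$ as

\[ \hat{h}(f, t) = \sum_{j=1}^{J} \beta_j\, \exp\{-2\pi i f \tau_j(t)\} \quad \leftrightarrow \quad h(\tau, t) = \sum_{j=1}^{J} \beta_j\, \delta\{\tau - \tau_j(t)\}, \tag{9.19} \]

where $\delta$ is the Dirac delta function. Substituting (9.19) into (9.18),

\[ y(t) = \sum_{j} \beta_j\, x(t - \tau_j(t)). \tag{9.20} \]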
j 


This says that the response at time t to an arbitrary input is the sum of the responses over all
paths. The response on path j is simply the input, delayed by $\tau_j(t)$ and attenuated by
$\beta_j$. Note that both the delay and attenuation are evaluated at the time t at which the output
is being measured.
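The sum over delayed and attenuated copies in (9.20) can be written out as a few lines of code. The sketch below assumes the simplified model of (9.15), with each path given as a triple of attenuation, initial delay, and delay rate; the names, the rectangular test pulse, and the two path values are hypothetical and serve only as an illustration.

```python
from typing import Callable, Sequence, Tuple

# Each path is (beta_j, tau0_j, tau_rate_j), with delay
# tau_j(t) = tau0_j + tau_rate_j * t as in the simplified model.
PathSpec = Tuple[complex, float, float]


def channel_output(x: Callable[[float], complex],
                   paths: Sequence[PathSpec],
                   t: float) -> complex:
    """y(t) = sum_j beta_j x(t - tau_j(t)), with tau_j evaluated at time t."""
    return sum(beta * x(t - (tau0 + rate * t)) for beta, tau0, rate in paths)


def pulse(u: float) -> float:
    """Rectangular pulse of duration 1 microsecond (an assumed test input)."""
    return 1.0 if 0.0 <= u < 1.0e-6 else 0.0


# Two assumed fixed paths: delays of 1 and 2.5 microseconds.
paths = [(0.5, 1.0e-6, 0.0), (-0.25, 2.5e-6, 0.0)]
for t in (0.5e-6, 1.5e-6, 3.0e-6):
    print(t, channel_output(pulse, paths, t))
```

In this example the two delayed pulses do not overlap, so the output shows first the copy from the shorter path and later the weaker, sign-reversed copy from the longer one.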


The idealized, non-physical, impulses in (9.19) arise because of the tacit assumption that the
attenuation and delay on each path are independent of frequency. It can be seen from (9.16)
that $\hat{h}(f, t)$ affects the output only over the frequency band where $\hat{x}(f)$ is non-zero. If
frequency independence holds over this band, it does no harm to assume it over all frequencies,
leading to the above impulses. For typical relatively narrow-band applications, this frequency
independence is usually a reasonable assumption.


Neither the general results about LTV systems nor the results for the multipath models of
(9.14) and (9.15) provide much immediate insight into the nature of fading. The following
two subsections look at this issue, first for sinusoidal inputs, and then for general narrow-band
inputs.


9.3.2 Doppler spread and coherence time 


Assuming the simplified model of multipath fading in (9.15), the system function $\hat{h}(f, t)$ can
be expressed as

\[ \hat{h}(f, t) = \sum_{j=1}^{J} \beta_j\, \exp\{-2\pi i f (\tau_j' t + \tau_j^o)\}. \]

The rate of change of delay, $\tau_j'$, on path j is related to the Doppler shift on path j at
frequency f by $D_j = -f \tau_j'$, and thus $\hat{h}(f, t)$ can be expressed directly in terms of the
Doppler shifts as

\[ \hat{h}(f, t) = \sum_{j=1}^{J} \beta_j\, \exp\{2\pi i (D_j t - f \tau_j^o)\}. \]

The response to an input $\exp\{2\pi i f t\}$ is then

\[ y_f(t) = \hat{h}(f, t)\, \exp\{2\pi i f t\} = \sum_{j=1}^{J} \beta_j\, \exp\{2\pi i (f + D_j) t - 2\pi i f \tau_j^o\}. \tag{9.21} \]

This is the sum of sinusoids around f ranging from $f + D_{\min}$ to $f + D_{\max}$, where $D_{\min}$
is the smallest of the Doppler shifts and $D_{\max}$ is the largest. The terms $-2\pi i f \tau_j^o$ are
simply phases.


The Doppler shifts $D_j$ above can be positive or negative, but can be assumed to be small relative
to the transmission frequency f. Thus $y_f(t)$ is a narrow band waveform whose bandwidth is the
spread between $D_{\min}$ and $D_{\max}$. This spread,

\[ D = \max_j D_j - \min_j D_j, \tag{9.22} \]

is defined as the Doppler spread of the channel. The Doppler spread is a function of f (since
all the Doppler shifts are functions of f), but it is usually viewed as a constant since it is
approximately constant over any given frequency band of interest.

As shown above, the Doppler spread is the bandwidth of $y_f(t)$, but it is now necessary to be
more specific about how to define fading. This will also lead to a definition of the coherence
time of a channel.

The fading in (9.21) can be brought out more clearly by expressing $\hat{h}(f, t)$ in terms of its
magnitude and phase, i.e., as $|\hat{h}(f, t)|\, e^{i \angle \hat{h}(f, t)}$. The response to
$\exp\{2\pi i f t\}$ is then

\[ y_f(t) = |\hat{h}(f, t)|\, \exp\{2\pi i f t + i \angle \hat{h}(f, t)\}. \tag{9.23} \]

This expresses $y_f(t)$ as an amplitude term $|\hat{h}(f, t)|$ times a phase modulation of magnitude 1.
This amplitude term $|\hat{h}(f, t)|$ is now defined as the fading amplitude of the channel at
frequency f. As explained above, $|\hat{h}(f, t)|$ and $\angle \hat{h}(f, t)$ are slowly varying with t
relative to $\exp\{2\pi i f t\}$, so it makes sense to view $|\hat{h}(f, t)|$ as a slowly varying
envelope, i.e., a fading envelope, around the received phase modulated sinusoid.

The fading amplitude can be interpreted more clearly in terms of the response $\Re[y_f(t)]$ to an
actual real input sinusoid $\cos(2\pi f t) = \Re[\exp(2\pi i f t)]$. Taking the real part of (9.23),

\[ \Re[y_f(t)] = |\hat{h}(f, t)|\, \cos[2\pi f t + \angle \hat{h}(f, t)]. \]

The waveform $\Re[y_f(t)]$ oscillates at roughly the frequency f inside the slowly varying limits
$\pm|\hat{h}(f, t)|$. This shows that $|\hat{h}(f, t)|$ is also the envelope, and thus the fading
amplitude, of $\Re[y_f(t)]$ (at the given frequency f). This interpretation will be extended later to
narrow band inputs around the frequency f.

We have seen from (9.21) that D is the bandwidth of $y_f(t)$, and it is also the bandwidth of
$\Re[y_f(t)]$. Assume initially that the Doppler shifts are centered around 0, i.e., that
$D_{\max} = -D_{\min}$. Then $\hat{h}(f, t)$ is a baseband waveform containing frequencies between
$-D/2$ and $+D/2$. The envelope of $\Re[y_f(t)]$, namely $|\hat{h}(f, t)|$, is the magnitude of a
waveform baseband limited to D/2. For the reflecting wall example, $D_1 = -D_2$, the Doppler
spread is $D = 2D_1$, and the envelope is $|\sin[2\pi (D/2) t]|$.

More generally, the Doppler shifts might be centered around some non-zero $\Delta$ defined as the
midpoint between $\min_j D_j$ and $\max_j D_j$. In this case, consider the frequency shifted system
function $\hat{\psi}(f, t)$ defined as

\[ \hat{\psi}(f, t) = \exp(-2\pi i t \Delta)\, \hat{h}(f, t) = \sum_{j=1}^{J} \beta_j\, \exp\{2\pi i t (D_j - \Delta) - 2\pi i f \tau_j^o\}. \tag{9.24} \]

As a function of t, $\hat{\psi}(f, t)$ has bandwidth D/2. Since

\[ |\hat{\psi}(f, t)| = |e^{-2\pi i \Delta t}\, \hat{h}(f, t)| = |\hat{h}(f, t)|, \]

the envelope of $\Re[y_f(t)]$ is the same as⁷ the magnitude of $\hat{\psi}(f, t)$, i.e., the magnitude
of a waveform baseband limited to D/2. Thus this limit to D/2 is valid independent of the Doppler
shift centering.


As an example, assume there is only one path and its Doppler shift is D_1. Then ĥ(f,t) is a
complex sinusoid at frequency D_1, but |ĥ(f,t)| is a constant, namely |β_1|. The Doppler spread is
0, the envelope is constant, and there is no fading. As another example, suppose the transmitter
in the reflecting wall example is moving away from the wall. This decreases both of the Doppler
shifts, but the difference between them, namely the Doppler spread, remains the same. The
envelope |ĥ(f,t)| then also remains the same. Both of these examples illustrate that it is the
Doppler spread rather than the individual Doppler shifts that controls the envelope.


Define the coherence time T_coh of the channel to be⁸

\[ T_{\rm coh} = \frac{1}{2D}. \tag{9.25} \]

This is one quarter of the wavelength of D/2 (the maximum frequency in ψ̂(f,t)) and one
half the corresponding sampling interval. Since the envelope is |ψ̂(f,t)|, T_coh serves as a crude
order-of-magnitude measure of the typical time interval for the envelope to change significantly.
Since this envelope is the fading amplitude of the channel at frequency f, T_coh is fundamentally
interpreted as the order-of-magnitude duration of a fade at f. Since D is typically less than
1000 Hz, T_coh is typically greater than 1/2 msec.

⁷Note that ψ̂(f,t), as a function of t, is baseband limited to D/2, whereas ĥ(f,t) is limited to
frequencies within D/2 of Δ and y_f(t) is limited to frequencies within D/2 of f+Δ. It is rather
surprising initially that all these waveforms have the same envelope. We focus on
ψ̂(f,t) = e^{−2πiΔt} ĥ(f,t) since this is the function that is baseband limited to D/2. Exercises
6.17 and 9.5 give additional insight and clarifying examples about the envelopes of real passband
waveforms.

⁸Some authors define T_coh as 1/(4D) and others as 1/D; these have the same order-of-magnitude
interpretations.


Although the rapidity of changes in a baseband function cannot be specified solely in terms 
of its bandwidth, high bandwidth functions tend to change more rapidly than low bandwidth 
functions; the definition of coherence time captures this loose relationship. For the reflecting 
wall example, the envelope goes from its maximum value down to 0 over the period T_coh; this is
more or less typical of more general examples. 


Crude though T_coh might be as a measure of fading duration, it is an important parameter
in describing wireless channels. It is used in waveform design, diversity provision, and channel
measurement strategies. Later, when stochastic models are introduced for multipath, the
relationship between fading duration and T_coh will become sharper.


It is important to realize that Doppler shifts are linear in the input frequency, and thus Doppler 
spread is also. For narrow band inputs, the variation of Doppler spread with frequency is 
insignificant. When comparing systems in different frequency bands, however, the variation of 
D with frequency is important. For example, a system operating at 8 GHz has a Doppler spread
8 times that of a 1 GHz system and thus a coherence time 1/8th as large; fading is faster, with
shorter fade durations, and channel measurements become outdated 8 times as fast.
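
The scaling of coherence time with carrier frequency can be checked with a small numerical
sketch. It assumes the reflecting wall geometry with the two Doppler shifts +fv/c and −fv/c,
so that D = 2fv/c; the function names and the speed of 30 m/s are hypothetical choices made
only for illustration.

```python
C = 3.0e8  # speed of light in m/s

def doppler_spread_reflecting_wall(carrier_hz, speed_mps):
    """Doppler spread D = 2 f v / c, assuming shifts +f v/c and -f v/c on the two paths."""
    return 2.0 * carrier_hz * speed_mps / C

def coherence_time(doppler_spread_hz):
    """Coherence time T_coh = 1/(2D), as defined in (9.25)."""
    return 1.0 / (2.0 * doppler_spread_hz)

speed = 30.0  # hypothetical vehicle speed in m/s
for fc in (1.0e9, 8.0e9):
    D = doppler_spread_reflecting_wall(fc, speed)
    print(f"fc = {fc / 1e9:.0f} GHz: D = {D:.0f} Hz, T_coh = {coherence_time(D) * 1e3:.4f} ms")
```

Under these assumptions the 8 GHz system has 8 times the Doppler spread and 1/8 the coherence
time of the 1 GHz system, in agreement with the statement above.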


9.3.3 Delay spread, and coherence frequency 


Another important parameter of a wireless channel is the spread in delay between different
paths. The delay spread ℒ is defined as the difference between the path delay on the longest
significant path and that on the shortest significant path. That is,

\[ \mathcal{L} = \max_j[\tau_j(t)] - \min_j[\tau_j(t)]. \]


The difference between path lengths is rarely greater than a few kilometers, so ℒ is rarely
more than several microseconds. Since the path delays τ_j(t) are changing with time, ℒ can also
change with time, so we focus on ℒ at some given t. Over the intervals of interest in modulation,
however, ℒ can usually be regarded as a constant.⁹


A closely related parameter is the coherence frequency of a channel. It is defined as¹⁰

\[ F_{\rm coh} = \frac{1}{2\mathcal{L}}. \tag{9.26} \]

The coherence frequency is thus typically greater than 100 kHz. This section shows that F_coh
provides an approximate answer to the following question: if the channel is badly faded at one
frequency f, how much does the frequency have to be changed to find an unfaded frequency?
We will see that, to a very crude approximation, f must be changed by F_coh.


The analysis of the parameters ℒ and F_coh is, in a sense, a time/frequency dual of the analysis of
D and T_coh. More specifically, the fading envelope of ℜ[y_f(t)] (in response to the input cos(2πft))
is |ĥ(f,t)|. The analysis of D and T_coh concerned the variation of |ĥ(f,t)| with t. That of ℒ and
F_coh concerns the variation of |ĥ(f,t)| with f.

⁹For the reflecting wall example, the path lengths are r_0 − vt and r_0 + vt, so the delay spread is
ℒ = 2vt/c. The change with t looks quite significant here, but at reasonable distances from the
reflector, the change is small relative to typical intersymbol intervals.

¹⁰F_coh is sometimes defined as 1/ℒ and sometimes as 1/(4ℒ); the interpretation is the same.


In the simplified multipath model of (9.15), ĥ(f,t) = Σ_{j=1}^{J} β_j exp{−2πifτ_j(t)}. For fixed t,
this is a weighted sum of J complex sinusoidal terms in the variable f. The ‘frequencies’ of these
terms, viewed as functions of f, are τ_1(t), …, τ_J(t). Let τ_mid be the midpoint between
min_j τ_j(t) and max_j τ_j(t) and define the function η̂(f,t) as

\[ \hat{\eta}(f,t) = e^{2\pi i f\tau_{\rm mid}}\,\hat{h}(f,t) = \sum_{j} \beta_j \exp\{-2\pi i f[\tau_j(t)-\tau_{\rm mid}]\}. \tag{9.27} \]


The shifted delays, τ_j(t) − τ_mid, vary with j from −ℒ/2 to +ℒ/2. Thus η̂(f,t), as a function of
f, has a ‘baseband bandwidth’¹¹ of ℒ/2. From (9.27), we see that |ĥ(f,t)| = |η̂(f,t)|. Thus the
envelope |ĥ(f,t)|, as a function of f, is the magnitude of a function ‘baseband limited’ to ℒ/2.


It is then reasonable to take 1/4 of a ‘wavelength’ of this bandwidth, i.e., F_coh = 1/(2ℒ), as
an order-of-magnitude measure of the required change in f to cause a significant change in the
envelope of ℜ[y_f(t)].
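
As a rough numerical illustration of this argument, the sketch below evaluates the envelope of
the system function for a two-path channel at a fixed t. The path delays (1 and 3 microseconds),
the equal path strengths, and the helper name system_function are hypothetical choices, not
values taken from the text.

```python
import numpy as np

def system_function(f, betas, delays):
    """Evaluate h_hat(f) = sum_j beta_j * exp(-2*pi*i*f*tau_j) at a fixed time t."""
    f = np.atleast_1d(np.asarray(f, dtype=float))
    betas = np.asarray(betas, dtype=complex)
    delays = np.asarray(delays, dtype=float)
    phases = np.exp(-2j * np.pi * f[:, None] * delays[None, :])
    return (betas[None, :] * phases).sum(axis=1)

delays = [1.0e-6, 3.0e-6]   # hypothetical path delays in seconds
betas = [1.0, 1.0]          # hypothetical equal path strengths
delay_spread = max(delays) - min(delays)
f_coh = 1.0 / (2.0 * delay_spread)   # coherence frequency of (9.26)

# Envelope at a peak (f = 0), one coherence frequency away, and two away.
envelope = np.abs(system_function([0.0, f_coh, 2.0 * f_coh], betas, delays))
print(f"F_coh = {f_coh / 1e3:.0f} kHz, envelope = {envelope}")
```

For this particular two-path case the envelope is 2|cos(πfℒ)|, so a change of F_coh in f takes
the envelope from a maximum to a null and a further change of F_coh takes it back to a maximum.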


The above argument relating ℒ to F_coh is virtually identical to that relating D to T_coh. The
interpretations of T_coh and F_coh as order-of-magnitude approximations are also virtually
identical. The duality here, however, is between the t and f in ĥ(f,t) rather than between time
and frequency for the actual transmitted and received waveforms. The envelope |ĥ(f,t)| used in
both of these arguments can be viewed as a short-term time-average in |ℜ[y_f(t)]| (see Exercise
9.6 (b)), and thus F_coh is interpreted as the frequency change required for significant change in
this time-average rather than in the response itself.


One of the major questions faced with wireless communication is how to spread an input signal 
or codeword over time and frequency (within the available delay and frequency constraints). If 
a signal is essentially contained both within a time interval T_coh and a frequency interval F_coh,
then a single fade can bring the entire signal far below the noise level. If, however, the signal
is spread over multiple intervals of duration T_coh and/or multiple bands of width F_coh, then a
single fade will affect only one portion of the signal. Spreading the signal over regions with
relatively independent fading is called diversity, which is studied later. For now, note that the
parameters T_coh and F_coh tell us how much spreading in time and frequency is required for using
such diversity techniques. 


In earlier chapters, the receiver timing has been delayed from the transmitter timing by the 
overall propagation delay; this is done in practice by timing recovery at the receiver. Timing 
recovery is also used in wireless communication, but since different paths have different
propagation delays, timing recovery at the receiver will approximately center the path delays
around 0. This means that the offset τ_mid in (9.27) becomes zero and the function
η̂(f,t) = ĥ(f,t). Thus η̂(f,t) can be omitted from further consideration and it can be assumed
without loss of generality that h(τ,t) is nonzero only for |τ| ≤ ℒ/2.


Next consider fading for a narrow-band waveform. Suppose that x(t) is a transmitted real
passband waveform of bandwidth W around a carrier f_c. Suppose moreover that W ≪ F_coh.
Then ĥ(f,t) ≈ ĥ(f_c,t) for f_c−W/2 ≤ f ≤ f_c+W/2. Let x⁺(t) be the positive frequency part of
x(t), so that x̂⁺(f) is nonzero only for f_c−W/2 ≤ f ≤ f_c+W/2. The response y⁺(t) to x⁺(t) is
given by (9.16) as y⁺(t) = ∫_{f≥0} x̂(f) ĥ(f,t) e^{2πift} df and is thus approximated as

\[ y^+(t) \approx \int_{f_c-W/2}^{f_c+W/2} \hat{x}^+(f)\,\hat{h}(f_c,t)\,e^{2\pi i f t}\,df = x^+(t)\,\hat{h}(f_c,t). \]

Taking the real part to find the response y(t) to x(t),

\[ y(t) \approx |\hat{h}(f_c,t)|\, \Re\big[x^+(t)\,e^{i\angle\hat{h}(f_c,t)}\big]. \tag{9.28} \]

¹¹In other words, the inverse Fourier transform, h(τ−τ_mid, t), is nonzero only for
|τ−τ_mid| ≤ ℒ/2.


In other words, for narrow-band communication, the effect of the channel is to cause fading with
envelope |ĥ(f_c,t)| and with phase change ∠ĥ(f_c,t). This is called flat fading or narrow-band
fading. The coherence frequency F_coh defines the boundary between flat and non-flat fading,
and the coherence time T_coh gives the order-of-magnitude duration of these fades.


The flat-fading response in (9.28) looks very different from the general response in (9.20) as a 
sum of delayed and attenuated inputs. The signal bandwidth in (9.28), however, is so small 
that if we view x(t) as a modulated baseband waveform, that baseband waveform is virtually 
constant over the different path delays. This will become clearer in the next section. 


9.4 Baseband system functions and impulse responses 


The next step in interpreting LTV channels is to represent the above bandpass system function 
in terms of a baseband equivalent. Recall that for any complex waveform u(t), baseband limited 
to W/2, the modulated real waveform x(t) around carrier frequency f_c is given by

\[ x(t) = u(t)\exp\{2\pi i f_c t\} + u^*(t)\exp\{-2\pi i f_c t\}. \]

Assume in what follows that f_c ≫ W/2.


In transform terms, x̂(f) = û(f − f_c) + û*(−f − f_c). The positive-frequency part of x(t) is
simply u(t) shifted up by f_c. To understand the modulation and demodulation in simplest terms,
consider a baseband complex sinusoidal input e^{2πift} for f ∈ [−W/2, W/2] as it is modulated,
transmitted through the channel, and demodulated (see Figure 9.6). Since the channel may
be subject to Doppler shifts, the recovered carrier, f̃_c, at the receiver might be different from
the actual carrier f_c. Thus, as illustrated, the positive-frequency channel output is
y_f(t) = ĥ(f+f_c, t) e^{2πi(f+f_c)t} and the demodulated waveform is ĥ(f+f_c, t) e^{2πi(f+f_c−f̃_c)t}.


For an arbitrary baseband-limited input, u(t) = ∫_{−W/2}^{W/2} û(f) e^{2πift} df, the
positive-frequency channel output is given by superposition as

\[ y^+(t) = \int_{-W/2}^{W/2} \hat{u}(f)\,\hat{h}(f+f_c,t)\,e^{2\pi i(f+f_c)t}\,df. \]

The demodulated waveform, v(t), is then y⁺(t) shifted down by the recovered carrier f̃_c, i.e.,

\[ v(t) = \int_{-W/2}^{W/2} \hat{u}(f)\,\hat{h}(f+f_c,t)\,e^{2\pi i(f+f_c-\tilde{f}_c)t}\,df. \]


Let Δ be the difference between recovered and transmitted carrier,¹² i.e., Δ = f̃_c − f_c. Thus

\[ v(t) = \int_{-W/2}^{W/2} \hat{u}(f)\,\hat{h}(f+f_c,t)\,e^{2\pi i(f-\Delta)t}\,df. \tag{9.29} \]

¹²It might be helpful to assume Δ = 0 on a first reading.
[Figure 9.6, block diagram: the input e^{2πift} enters the baseband-to-passband modulator,
giving e^{2πi(f+f_c)t}; the multipath channel ĥ(f+f_c,t) gives ĥ(f+f_c,t) e^{2πi(f+f_c)t}; with
WGN Z(t) = 0, the passband-to-baseband demodulator gives ĥ(f+f_c,t) e^{2πi(f+f_c−f̃_c)t}.]

Figure 9.6: A complex baseband sinusoid, as it is modulated to passband, passed through
a multipath channel, and demodulated without noise. The modulation is around a carrier
frequency f_c and the demodulation is in general at another frequency f̃_c.


The relationship between the input u(t) and the output v(t) at baseband can be expressed
directly in terms of a baseband system function ĝ(f,t) defined as

\[ \hat{g}(f,t) = \hat{h}(f+f_c,t)\,e^{-2\pi i\Delta t}. \tag{9.30} \]

Then (9.29) becomes

\[ v(t) = \int_{-W/2}^{W/2} \hat{u}(f)\,\hat{g}(f,t)\,e^{2\pi i f t}\,df. \tag{9.31} \]


This is exactly the same form as the passband input-output relationship in (9.16). Letting
g(τ,t) = ∫ ĝ(f,t) e^{2πifτ} df be the LTV baseband impulse response, the same argument as used
to derive the passband convolution equation leads to

\[ v(t) = \int_{-\infty}^{\infty} u(t-\tau)\,g(\tau,t)\,d\tau. \tag{9.32} \]


The interpretation of this baseband LTV convolution equation is the same as that of the passband
LTV convolution equation in (9.18). For the simplified multipath model of (9.15),
ĥ(f,t) = Σ_{j=1}^{J} β_j exp{−2πifτ_j(t)} and thus, from (9.30), the baseband system function is

\[ \hat{g}(f,t) = \sum_{j=1}^{J} \beta_j \exp\{-2\pi i(f+f_c)\tau_j(t) - 2\pi i\Delta t\}. \tag{9.33} \]


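We can separate the dependence on t from that on f by rewriting this as

\[ \hat{g}(f,t) = \sum_{j=1}^{J} \gamma_j(t)\exp\{-2\pi i f\tau_j(t)\} \quad\text{where}\quad \gamma_j(t) = \beta_j \exp\{-2\pi i f_c\tau_j(t) - 2\pi i\Delta t\}. \tag{9.34} \]

Taking the inverse Fourier transform for fixed t, the LTV baseband impulse response is

\[ g(\tau,t) = \sum_{j} \gamma_j(t)\,\delta\{\tau-\tau_j(t)\}. \tag{9.35} \]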
Thus the impulse response at a given receive-time t is a sum of impulses, the jth of which is
delayed by τ_j(t) and has an attenuation and phase given by γ_j(t). Substituting this impulse
response into the convolution equation, the input-output relation is

\[ v(t) = \sum_{j} \gamma_j(t)\,u(t-\tau_j(t)). \]
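
The input-output relation above is easy to evaluate numerically. The following is a minimal
sketch, assuming that the input u and the path delays are supplied as Python functions of time;
the name baseband_output and all numerical values (carrier, delays, Doppler shifts, path
strengths) are hypothetical. The two-path case uses equal and opposite Doppler shifts of 50 Hz
with path strengths of opposite sign, in the spirit of the reflecting wall example.

```python
import numpy as np

def baseband_output(u, t, betas, delay_fns, fc, delta=0.0):
    """Evaluate v(t) = sum_j gamma_j(t) u(t - tau_j(t)), with gamma_j(t) as in (9.34)."""
    t = np.asarray(t, dtype=float)
    v = np.zeros_like(t, dtype=complex)
    for beta, tau_fn in zip(betas, delay_fns):
        tau = tau_fn(t)
        gamma = beta * np.exp(-2j * np.pi * fc * tau - 2j * np.pi * delta * t)
        v += gamma * u(t - tau)
    return v

fc = 1.0e9                                      # hypothetical carrier frequency in Hz
t = np.linspace(0.0, 0.01, 1001)                # 10 ms of receive time
u = lambda x: np.ones_like(x, dtype=complex)    # constant baseband input

# One path with Doppler shift 50 Hz: tau(t) = tau0 - D t / fc.
one_path = baseband_output(u, t, [0.5], [lambda x: 1e-6 - 50.0 * x / fc], fc)

# Two paths with Doppler shifts +50 Hz and -50 Hz (Doppler spread D = 100 Hz).
two_path = baseband_output(
    u, t, [0.5, -0.5],
    [lambda x: 1e-6 - 50.0 * x / fc, lambda x: 1e-6 + 50.0 * x / fc], fc)
```

With a single path the magnitude of the output is constant (no fading), whereas with the two
paths the magnitude follows |sin[2π(D/2)t]| with D = 100 Hz.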


This baseband representation can provide additional insight about Doppler spread and coherence
time. Consider the system function in (9.34) at f = 0 (i.e., at the passband carrier frequency).
Letting D_j be the Doppler shift at f_c on path j, we have τ_j(t) = τ_j^o − D_j t/f_c. Then

\[ \hat{g}(0,t) = \sum_{j=1}^{J} \gamma_j(t) \quad\text{where}\quad \gamma_j(t) = \beta_j \exp\{2\pi i[D_j-\Delta]t - 2\pi i f_c\tau_j^o\}. \]


The carrier recovery circuit estimates the carrier frequency from the received sum of Doppler 
shifted versions of the carrier, and thus it is reasonable to approximate the shift in the recovered 
carrier by the midpoint between the smallest and largest Doppler shift. Thus ĝ(0,t) is the same
as the frequency-shifted system function ψ̂(f_c,t) of (9.24). In other words, the frequency shift
Δ, which was introduced in (9.24) as a mathematical artifice, now has a physical interpretation
as the difference between f_c and the recovered carrier f̃_c. We see that ĝ(0,t) is a waveform with
bandwidth D/2, and that T_coh = 1/(2D) is an order-of-magnitude approximation to the time
over which ĝ(0,t) changes significantly.


Next consider the baseband system function ĝ(f,t) at baseband frequencies other than 0. Since
W ≪ f_c, the Doppler spread at f_c + f is approximately equal to that at f_c, and thus ĝ(f,t), as
a function of t for each |f| ≤ W/2, is also approximately baseband limited to D/2 (where D is
defined at f = f_c).


Finally, consider flat fading from a baseband perspective. Flat fading occurs when W ≪ F_coh,
and in this case¹³ ĝ(f,t) ≈ ĝ(0,t). Then, from (9.31),

\[ v(t) = \hat{g}(0,t)\,u(t). \tag{9.36} \]


In other words, the received waveform, in the absence of noise, is simply an attenuated and phase 
shifted version of the input waveform. If the carrier recovery circuit also recovers phase, then 
v(t) is simply an attenuated version of u(t). For flat fading, then, T_coh is the order-of-magnitude
interval over which the ratio of output to input can change significantly.


In summary, this section has provided both a passband and baseband model for wireless com- 
munication. The basic equations are very similar, but the baseband model is somewhat easier 
to use (although somewhat more removed from the physics of fading). The ease of use comes 
from the fact that all the waveforms are slowly varying and all are complex. This can be seen 
most clearly by comparing the flat-fading relations, (9.28) for passband and (9.36) for baseband. 


9.4.1 A discrete-time baseband model 


This section uses the sampling theorem to convert the above continuous-time baseband channel
to a discrete-time channel. If the baseband input u(t) is bandlimited to W/2, then it can be
represented by its T-spaced samples, T = 1/W, as u(t) = Σ_ℓ u_ℓ sinc(t/T − ℓ), where u_ℓ = u(ℓT).
Using (9.32), the baseband output is given by

\[ v(t) = \sum_{\ell} u_\ell \int g(\tau,t)\,\mathrm{sinc}(t/T - \tau/T - \ell)\,d\tau. \tag{9.37} \]

¹³There is an important difference between saying that the Doppler spread at frequency f+f_c is
close to that at f_c and saying that ĝ(f,t) ≈ ĝ(0,t). The first requires only that W be a
relatively small fraction of f_c, and is reasonable even for W = 100 MHz and f_c = 1 GHz, whereas
the second requires W ≪ F_coh, which might be on the order of hundreds of kHz.
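The sampled outputs, v_m = v(mT), at multiples of T are then given by¹⁴

\[ v_m = \sum_{\ell} u_\ell \int g(\tau,mT)\,\mathrm{sinc}(m - \ell - \tau/T)\,d\tau \tag{9.38} \]

\[ \phantom{v_m} = \sum_{k} u_{m-k} \int g(\tau,mT)\,\mathrm{sinc}(k - \tau/T)\,d\tau, \tag{9.39} \]

where k = m−ℓ. By labeling the above integral as g_{k,m}, (9.39) can be written in the
discrete-time form

\[ v_m = \sum_{k} g_{k,m}\,u_{m-k} \quad\text{where}\quad g_{k,m} = \int g(\tau,mT)\,\mathrm{sinc}(k - \tau/T)\,d\tau. \tag{9.40} \]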


In discrete-time terms, g_{k,m} is the response at mT to an input sample at (m−k)T. We refer
to g_{k,m} as the kth (complex) channel filter tap at discrete output time mT. This discrete-time
filter is represented in Figure 9.7. As discussed later, the number of channel filter taps (i.e.,
different values of k) for which g_{k,m} is significantly non-zero is usually quite small. If the kth
tap is unchanging with m for each k, then the channel is linear time-invariant. If each tap
changes slowly with m, then the channel is called slowly time-varying. Cellular systems and
most wireless systems of current interest are slowly time-varying.

Figure 9.7: Time-varying discrete-time baseband channel model. Each unit of time a new
input enters the shift register and the old values shift right. The channel taps also change,
but slowly. Note that the output timing here is offset from the input timing by two units.
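
The discrete-time relation in (9.40) is a filter whose taps depend on the output time m. A minimal
sketch follows, assuming the taps are stored as an array with one row per output time; the
function name time_varying_filter and the parameter k_min (the smallest tap index, which may be
negative when the output timing is offset from the input timing) are hypothetical.

```python
import numpy as np

def time_varying_filter(u, taps, k_min=0):
    """Compute v[m] = sum_k g[k, m] * u[m - k], where taps[m, i] holds g[k_min + i, m].

    Input samples outside the range of u are treated as zero.
    """
    u = np.asarray(u, dtype=complex)
    taps = np.asarray(taps, dtype=complex)
    num_outputs, num_taps = taps.shape
    v = np.zeros(num_outputs, dtype=complex)
    for m in range(num_outputs):
        for i in range(num_taps):
            k = k_min + i
            if 0 <= m - k < len(u):
                v[m] += taps[m, i] * u[m - k]
    return v

# When the taps do not change with m, the channel is linear time-invariant and the
# result agrees with an ordinary convolution.
u_demo = np.array([1.0, 2.0, 3.0, 4.0])
lti_taps = np.tile([1.0, 0.5], (len(u_demo), 1))
v_demo = time_varying_filter(u_demo, lti_taps)
```

A slowly time-varying channel is obtained simply by letting the rows of the tap array drift
slowly with m.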


The filter tap g_{k,m} for the simplified multipath model is obtained by substituting (9.35), i.e.,
g(τ,t) = Σ_j γ_j(t) δ{τ−τ_j(t)}, into the second part of (9.40), getting

\[ g_{k,m} = \sum_{j} \gamma_j(mT)\,\mathrm{sinc}\Big[k - \frac{\tau_j(mT)}{T}\Big]. \tag{9.41} \]

¹⁴Due to Doppler spread, the bandwidth of the output v(t) can be slightly larger than the
bandwidth W/2 of the input u(t). Thus the output samples v_m do not fully represent the output
waveform. However, a QAM demodulator first generates each output signal v_m corresponding to the
input signal u_m, so these output samples are of primary interest. A more careful treatment would
choose a more appropriate modulation pulse than a sinc function and then use some combination of
channel estimation and signal detection to produce the output samples. This is beyond our current
interest.
The contribution of path j to tap k can be visualized from Figure 9.8. If the path delay equals
kT for some integer k, then path j contributes only to tap k, whereas if the path delay lies
between kT and (k+1)T, it contributes to several taps around k and k+1.

Figure 9.8: This shows sinc(k − τ_j(mT)/T), as a function of k, marked at integer values of k.
In the illustration, τ_j(mT)/T = 0.8. The figure indicates that each path contributes primarily
to the tap or taps closest to the given path delay.
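
The way a path spills over into neighboring taps can be seen by evaluating (9.41) directly. The
sketch below assumes the path strengths γ_j(mT) and delays τ_j(mT) at one output time are given
as lists; the function name channel_taps is hypothetical. It relies on the fact that NumPy's
sinc is the normalized function sin(πx)/(πx).

```python
import numpy as np

def channel_taps(gammas, delays, T, ks):
    """Evaluate g_k = sum_j gamma_j * sinc(k - tau_j / T), as in (9.41), at one output time."""
    gammas = np.asarray(gammas, dtype=complex)
    delays = np.asarray(delays, dtype=float)
    ks = np.asarray(ks, dtype=float)
    # np.sinc(x) computes sin(pi x) / (pi x), the sinc function used in the sampling theorem.
    return (gammas[None, :] * np.sinc(ks[:, None] - delays[None, :] / T)).sum(axis=1)

ks = np.arange(-3, 5)
taps_fractional = channel_taps([1.0], [0.8], 1.0, ks)   # path delay 0.8 T, as in Figure 9.8
taps_aligned = channel_taps([1.0], [2.0], 1.0, ks)      # path delay exactly 2 T
for k, g in zip(ks, taps_fractional):
    print(f"k = {k:2d}: |g_k| = {abs(g):.4f}")
```

For the delay of 0.8T the largest tap is at k = 1, with smaller contributions decaying on both
sides; for the delay of exactly 2T only tap 2 is non-zero (up to rounding).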


The relation between the discrete-time and continuous-time baseband models can be better
understood by observing that when the input is baseband limited to W/2, then the baseband
system function ĝ(f,t) is irrelevant for |f| > W/2. Thus an equivalent filtered system function
ĝ_W(f,t) and impulse response g_W(τ,t) can be defined by filtering out the frequencies above W/2,
i.e.,

\[ \hat{g}_W(f,t) = \hat{g}(f,t)\,\mathrm{rect}(f/W); \qquad g_W(\tau,t) = g(\tau,t) * W\,\mathrm{sinc}(\tau W). \tag{9.42} \]

Comparing this with the second half of (9.40), we see that the tap gains are simply scaled sample
values of the filtered impulse response, i.e.,

\[ g_{k,m} = T\,g_W(kT, mT). \tag{9.43} \]

For the simple multipath model, the filtered impulse response replaces the impulse at τ_j(t) by a
scaled sinc function centered at τ_j(t) as illustrated in Figure 9.8.


Now consider the number of taps required in the discrete time model. The delay spread, ℒ,
is the interval between the smallest and largest path delay¹⁵ and thus there are about ℒ/T
taps close to the various path delays. There are a small number of additional significant taps
corresponding to the decay time of the sinc function. In the special case where ℒ/T is much
smaller than 1, the timing recovery will make all the delay terms close to 0 and the discrete-time
model will have only one significant tap. This corresponds to the flat-fading case we looked at
earlier.


The coherence time T_coh provides a sense of how fast the individual taps g_{k,m} are changing
with respect to m. If a tap g_{k,m} is affected by only a single path, then |g_{k,m}| will be virtually
unchanging with m, although ∠g_{k,m} can change according to the Doppler shift. If a tap is
affected by several paths, then its magnitude can fade at a rate corresponding to the spread of
the Doppler shifts affecting that tap.

¹⁵Technically, ℒ varies with the output time t, but we generally ignore this since the variation is
slow and ℒ has only an order-of-magnitude significance.
9.5 Statistical channel models


The previous subsection created a discrete-time baseband fading channel in which the individual
tap gains g_{k,m} in (9.41) are scaled sums of the attenuation and smoothed delay on each path. The
physical paths are unknown at the transmitter and receiver, however, so from an input/output
viewpoint, it is the tap gains themselves¹⁶ that are of primary interest. Since these tap gains
change with time, location, bandwidth, carrier frequency, and other parameters, a statistical
characterization of the tap gains is needed in order to understand how to communicate over
these channels. This means that each tap gain g_{k,m} should be viewed as a sample value of a
random variable G_{k,m}.


There are many approaches to characterizing these tap-gain random variables. One would be to 
gather statistics over a very large number of locations and conditions, and then model the joint 
probability densities of these random variables according to these measurements, and do this 
conditionally on various types of locations (cities, hilly areas, flat areas, highways, buildings, 
etc.). Much data of this type has been gathered, but it is more detailed than what is desirable 
to achieve an initial understanding of wireless issues. 


Another approach, which is taken here and in virtually all the theoretical work in the field, is 
to choose a few very simple probability models that are easy to work with, and then use the 
results from these models to gain insight about actual physical situations. After presenting the 
models, we discuss the ways in which the models might or might not reflect physical reality. 
Some standard results are then derived from these models, along with a discussion of how they 
might reflect actual performance. 


In the Rayleigh tap-gain model, the real and imaginary parts of all the tap gains are taken to be
zero-mean jointly-Gaussian random variables. Each tap gain G_{k,m} is thus a complex Gaussian
random variable which is further assumed to be circularly symmetric, i.e., to have iid real and
imaginary parts. Finally it is assumed that the probability density of each G_{k,m} is the same for
all m. We can then express the probability density of G_{k,m} as

\[ f_{\Re(G_{k,m}),\Im(G_{k,m})}(g_{\rm re}, g_{\rm im}) = \frac{1}{2\pi\sigma_k^2} \exp\left\{\frac{-g_{\rm re}^2 - g_{\rm im}^2}{2\sigma_k^2}\right\}, \tag{9.44} \]

where σ_k² is the variance of ℜ(G_{k,m}) (and thus also of ℑ(G_{k,m})) for each m. We later address
how these rv’s are related between different m and k.


As shown in Exercise 7.1, the magnitude |G_{k,m}| of the kth tap is a Rayleigh rv with density

\[ f_{|G_{k,m}|}(|g|) = \frac{|g|}{\sigma_k^2} \exp\left\{\frac{-|g|^2}{2\sigma_k^2}\right\}. \tag{9.45} \]

This model is called the Rayleigh fading model. Note from (9.44) that the model includes a
uniformly distributed phase that is independent of the Rayleigh distributed amplitude. The
assumption of uniform phase is quite reasonable, even in a situation with only a small number
of paths, since a quarter wavelength at cellular frequencies is only a few inches. Thus even with
fairly accurately specified path lengths, we would expect the phases to be modeled as uniform
and independent of each other. This would also make the assumption of independence between
tap-gain phase and amplitude reasonable.

¹⁶Many wireless channels are characterized by a very small number of significant paths, and the
corresponding receivers track these individual paths rather than using a receiver structure based
on the discrete-time model. The discrete-time model is none-the-less a useful conceptual model
for understanding the statistical variation of multiple paths.
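
The Rayleigh model is easy to sample, which gives a quick sanity check on (9.44) and (9.45). The
sketch below draws independent circularly symmetric complex Gaussian tap gains and compares
sample statistics with two standard properties of the Rayleigh density, namely a mean of
σ√(π/2) and a second moment of 2σ². The value σ = 0.5, the sample size, and the seed are
arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5            # standard deviation of the real and of the imaginary part
num_samples = 200_000

# Circularly symmetric complex Gaussian tap gains: iid N(0, sigma^2) real and imaginary parts.
g = rng.normal(0.0, sigma, num_samples) + 1j * rng.normal(0.0, sigma, num_samples)
magnitude = np.abs(g)
phase = np.angle(g)

mean_magnitude = magnitude.mean()        # compare with sigma * sqrt(pi / 2)
mean_power = (magnitude ** 2).mean()     # compare with 2 * sigma^2
phase_amp_corr = np.corrcoef(magnitude, phase)[0, 1]   # near 0 if independent

print(mean_magnitude, sigma * np.sqrt(np.pi / 2.0))
print(mean_power, 2.0 * sigma ** 2)
print(phase_amp_corr)
```

A near-zero sample correlation between magnitude and phase is consistent with, although of
course weaker than, the independence of phase and amplitude noted above.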


The assumption of Rayleigh distributed amplitudes is more problematic. If the channel involves 
scattering from a large number of small reflectors, the central limit theorem would suggest a 
jointly Gaussian assumption for the tap gains,¹⁷ thus making (9.44) reasonable. For situations
with a small number of paths, however, there is no good justification for (9.44) or (9.45). 


There is a frequently used alternative model in which the line of sight path (often called a specular 
path) has a known large magnitude, and is accompanied by a large number of independent 
smaller paths. In this case, g_{k,m}, at least for one value of k, can be modeled as a sample value of
a complex Gaussian rv with a mean (corresponding to the specular path) plus real and imaginary 
iid fluctuations around the mean. The magnitude of such a rv has a Rician distribution. Its 
density has quite a complicated form, but the error probability for simple signaling over this 
channel model is quite simple and instructive. 


The preceding paragraphs make it appear as if a model is being constructed for some known 
number of paths of given character. Much of the reason for wanting a statistical model, however, 
is to guide the design of transmitters and receivers. Having a large number of models means 
investigating the performance of given schemes over all such models, or measuring the channel, 
choosing an appropriate model, and switching to a scheme appropriate for that model. This is 
inappropriate for an initial treatment, and perhaps inappropriate for design, returning us to the 
Rayleigh and Rician models. One reasonable point of view here is that these models are often 
poor approximations for individual physical situations, but when averaged over all the physical 
situations that a wireless system must operate over, they make more sense.¹⁸ At any rate, these
models provide a number of insights into communication in the presence of fading. 


Modeling each g_{k,m} as a sample value of a complex rv G_{k,m} provides part of the needed
statistical description, but this is not the only issue. The other major issue is how these
quantities vary with time. In the Rayleigh fading model, these random variables have zero mean,
and it will make a great deal of difference to useful communication techniques if the sample
values can be estimated in terms of previous values. A statistical quantity that models this
relationship is known as the tap-gain correlation function, R(k,Δ). It is defined as

\[ R(k,\Delta) = \mathsf{E}[G_{k,m}\,G^*_{k,m+\Delta}]. \tag{9.46} \]

This gives the autocorrelation function of the sequence of complex random variables, modeling
each given tap k as it evolves in time. It is tacitly assumed that this is not a function of time m,
which means that the sequence {G_{k,m}; m ∈ ℤ} for each k is assumed to be wide-sense stationary.
It is also assumed that, as a random variable, G_{k,m} is independent of G_{k′,m′} for all k ≠ k′ and
all m, m′. This final assumption is intuitively plausible¹⁹ since paths in different ranges of delay
contribute to G_{k,m} for different values of k.


The tap-gain correlation function is useful as a way of expressing the statistics for how tap gains
change, given a particular bandwidth W. It does not address the questions comparing different
bandwidths for communication. If we visualize increasing the bandwidth, several things happen.
First, since the taps are separated in time by 1/W, the range of delay corresponding to a single
tap becomes narrower. Thus there are fewer paths contributing to each tap, and the Rayleigh
approximation can in many cases become poorer. Second, the sinc functions of (9.41) become
narrower, so the path delays spill over less in time. For this same reason, R(k,0) for each k gives a
finer grained picture of the amount of power being received in the delay window of width k/W. In
summary, as this model is applied to larger W, more detailed statistical information is provided
about delay and correlation at that delay, but the information becomes more questionable.

¹⁷In fact, much of the current theory of fading was built up in the 1960s when both space
communication and military channels of interest then were well modeled as scattering channels
with a very large number of small reflectors.

¹⁸This is somewhat oversimplified. As shown in Exercise 9.9, a random choice of a small number of
paths from a large possible set does not necessarily lead to a Rayleigh distribution. There is also
the question of an initial choice of power level at any given location.

¹⁹One could argue that a moving path would gradually travel from the range of one tap to another.
This is true, but the time intervals for such changes are typically large relative to the other
intervals of interest.


In terms of R(k,Δ), the multipath spread ℒ might be defined as the range of kT over which
R(k,0) is significantly non-zero. This is somewhat preferable to the previous “definition” in
that the statistical nature of ℒ becomes explicit and the reliance on some sort of stationarity
becomes explicit. In order for this definition to make much sense, however, the bandwidth W
must be large enough for several significant taps to exist.


The coherence time T_coh can also be defined more explicitly as ΔT for the smallest value of
Δ > 0 for which R(0,Δ) is significantly different from R(0,0). Both these definitions maintain
some ambiguity about what ‘significant’ means, but they face the reality that ℒ and T_coh should
be viewed probabilistically rather than as instantaneous values.


9.5.1 Passband and baseband noise 


The statistical channel model above focuses on how multiple paths and Doppler shifts can affect 
the relationship between input and output, but the noise and the interference from other wireless 
channels have been ignored. The interference from other users will continue to be ignored (except 
for regarding it as additional noise), but the noise will now be included. 


Assume that the noise is WGN with power WN_0 over the bandwidth W. The earlier convention
will still be followed of measuring both signal power and noise power at baseband. Extending
the deterministic baseband input/output model v_m = Σ_k g_{k,m} u_{m−k} to include noise as well as
randomly varying tap gains,

\[ V_m = \sum_{k} G_{k,m}\,U_{m-k} + Z_m, \tag{9.47} \]

where …, Z_{−1}, Z_0, Z_1, …, is a sequence of iid circularly symmetric complex Gaussian random
variables. Assume also that the inputs, the tap gains, and the noise are statistically independent
of each other.


The assumption of WGN essentially means that the primary source of noise is at the receiver 
or is radiation impinging on the receiver that is independent of the paths over which the signal 
is being received. This is normally a very good assumption for most communication situations. 
Since the inputs and outputs here have been modeled as samples at rate W of the baseband 
processes, we have E[|U,,|?] = P where P is the baseband input power constraint. Similarly, 
E[|Zm|?] = NoW. Each complex noise rv is thus denoted as Z, ~ CN’(0, W No) 


The channel tap gains will be normalized so that $V'_m = \sum_k G_{k,m} U_{m-k}$ satisfies $E[|V'_m|^2] = P$. It
can be seen that this normalization is achieved by

$$E\Big[\sum_k |G_{k,0}|^2\Big] = 1. \qquad (9.48)$$

This assumption is similar to our earlier assumption for the ordinary (non-fading) WGN channel
that the overall attenuation of the channel is removed from consideration. In other words, both 
here and there we are defining signal power as the power of the received signal in the absence 
of noise. This is conventional in the communication field and allows us to separate the issue of 
attenuation from that of coding and modulation. 


It is important to recognize that this assumption cannot be used in a system where feedback 
from receiver to transmitter is used to alter the signal power when the channel is faded. 


There has always been a certain amount of awkwardness about scaling from baseband to passband,
where the signal power and noise power each increase by a factor of 2. Note that we have
also gone from a passband channel filter H(f,t) to a baseband filter G(f,t) using the same
convention as used for input and output. It is not difficult to show that if this property of treating
signals and channel filters identically is preserved, and the convolution equation is preserved at 
baseband and passband, then losing a factor of 2 in power is inevitable in going from passband 
to baseband. 


9.6 Data detection 


A reasonable approach to detection for wireless channels is to measure the channel filter taps 
as they evolve in time, and to use these measured values in detecting data. If the response can 
be measured accurately, then the detection problem becomes very similar to that for wireline 
channels; i.e., detection in WGN. 


Even under these ideal conditions, however, there are a number of problems. For one thing, 
even if the transmitter has perfect feedback about the state of the channel, power control is a 
difficult question; namely, how much power should be sent as a function of the channel state? 


For voice, both maintaining voice quality and maintaining small constant delay are important.
This leads to a desire to send information at a constant rate, which in turn leads to increased
transmission power when the channel is poor. This is very wasteful of power, however, since
common sense says that if power is scarce and delay is unimportant, then the power and
transmission rate should be decreased when the channel is poor.


Increasing power when the channel is poor has a mixed impact on interference between users. 
This strategy maintains equal received power at a base station for all users in the cell
corresponding to that base station. This helps reduce the effect of multiaccess interference within the
same cell. The interference between neighboring cells can be particularly bad, however, since 
fading on the channel between a cell phone and its base station is not highly correlated with 
fading between that cell phone and another base station. 


For data, delay is less important, so data can be sent at high rate when the channel is good, 
and at low rate (or zero rate) when the channel is poor. There is a straightforward
information-theoretic technique called water filling that can be used to maximize overall transmission rate
at a given overall power. The scaling assumption that we made above about input and output 
power must be modified for all of these issues of power control. 


An important insight from this discussion is that the power control used for voice should be very 
different from that for data. If the same system is used for both voice and data applications, 
then the basic mechanisms for controlling power and rate should be very different for the two 
applications. 
In this section, power control and rate control are not considered, and the focus is simply on
detecting signals under various assumptions about the channel and the state of knowledge at 
the receiver. 


9.6.1 Binary detection in flat Rayleigh fading 


Consider a very simple example of communication in the absence of channel measurement.
Assume that the channel can be represented by a single discrete-time complex filter tap $G_{0,m}$,
which we abbreviate as $G_m$. Also assume Rayleigh fading; i.e., the probability density of the
magnitude of each $G_m$ is

$$f_{|G_m|}(|g|) = 2|g| \exp\{-|g|^2\}; \qquad |g| \ge 0, \qquad (9.49)$$

or, equivalently, the density of $\gamma = |G_m|^2 \ge 0$ is

$$f(\gamma) = \exp(-\gamma); \qquad \gamma \ge 0. \qquad (9.50)$$


The phase is uniform over $[0, 2\pi)$ and independent of the magnitude. Equivalently, the real and
imaginary parts of $G_m$ are iid Gaussian, each with variance 1/2. The Rayleigh fading has been
scaled in this way to maintain equality between the input power, $E[|U_m|^2]$, and the output signal
power, $E[|U_m|^2 |G_m|^2]$. It is assumed that $U_m$ and $G_m$ are independent, i.e., that feedback is
not used to control the input power as a function of the fading. For the time being, however,
the dependence between the taps $G_m$ at different times $m$ is not relevant.


This model is called flat fading for the following reason. A single-tap discrete-time model, where
$v(mT) = g_{0,m} u(mT)$, corresponds to a continuous-time baseband model for which $g(\tau, t) =
g_0(t)\,\mathrm{sinc}(\tau/T)$. Thus the baseband system function for the channel is given by $\hat{g}(f, t) =
g_0(t)\,\mathrm{rect}(fT)$. Thus the fading is constant (i.e., flat) over the baseband frequency range used
for communication. When more than one tap is required, the fading varies over the baseband
region. To state this another way, the flat fading model is appropriate when the coherence
frequency is greater than the baseband bandwidth.


Consider using binary antipodal signaling with $U_m = \pm a$ for each $m$. Assume that $\{U_m; m \in \mathbb{Z}\}$
is an iid sequence with equiprobable use of plus and minus $a$. This signaling scheme fails
completely, even in the absence of noise, since the phase of the received symbol is uniformly
distributed between 0 and $2\pi$ under each hypothesis, and the received amplitude is similarly
independent of the hypothesis. It is easy to see that phase modulation is similarly flawed. In
fact, signal structures must be used in which either different symbols have different magnitudes,
or, alternatively, successive signals must be dependent.[20]


Next consider a form of binary pulse-position modulation where, for each pair of time-samples,
one of two possible signal pairs, $(a, 0)$ or $(0, a)$, is sent. This has the same performance as a
number of binary orthogonal modulation schemes such as minimum shift keying (see Exercise
8.16), but is simpler to describe in discrete time. The output is then

$$V_m = U_m G_m + Z_m; \qquad m = 0, 1, \qquad (9.51)$$

where, under one hypothesis, the input signal pair is $U = (a, 0)$, and under the other hypothesis,
$U = (0, a)$. The noise samples $\{Z_m; m \in \mathbb{Z}\}$ are iid circularly symmetric complex Gaussian
random variables, $Z_m \sim \mathcal{CN}(0, N_0 W)$. Assume for now that the detector looks only at the
outputs $V_0$ and $V_1$.


[20] For example, if the channel is slowly varying, differential phase modulation, where data is sent by the difference
between the phase of successive signals, could be used.


Given $U = (a, 0)$, $V_0 = aG_0 + Z_0$ is the sum of two independent complex Gaussian random
variables, the first with variance $a^2/2$ per dimension, and the second with variance $N_0W/2$ per
dimension. Thus, given $U = (a, 0)$, the real and imaginary parts of $V_0$ are independent, each
$\mathcal{N}(0, a^2/2 + N_0W/2)$. Similarly, given $U = (a, 0)$, the real and imaginary parts of $V_1 = Z_1$
are independent, each $\mathcal{N}(0, N_0W/2)$. Finally, since the noise variables are independent, $V_0$ and
$V_1$ are independent (given $U = (a, 0)$). The joint probability density[21] of $(V_0, V_1)$ at $(v_0, v_1)$,
conditional on hypothesis $U = (a, 0)$, is therefore


$$f_0(v_0, v_1) = \frac{1}{(2\pi)^2 (a^2/2 + WN_0/2)(WN_0/2)} \exp\Big\{ -\frac{|v_0|^2}{a^2 + WN_0} - \frac{|v_1|^2}{WN_0} \Big\}, \qquad (9.52)$$

where $f_0$ denotes the conditional density given hypothesis $U = (a, 0)$. Note that the density in
(9.52) depends only on the magnitude and not the phase of $v_0$ and $v_1$. Treating the alternate
hypothesis in the same way, and letting $f_1$ denote the conditional density given $U = (0, a)$,

$$f_1(v_0, v_1) = \frac{1}{(2\pi)^2 (a^2/2 + WN_0/2)(WN_0/2)} \exp\Big\{ -\frac{|v_0|^2}{WN_0} - \frac{|v_1|^2}{a^2 + WN_0} \Big\}. \qquad (9.53)$$


The log likelihood ratio is then

$$\mathrm{LLR}(v_0, v_1) = \ln\Big\{ \frac{f_0(v_0, v_1)}{f_1(v_0, v_1)} \Big\} = \frac{\big[|v_0|^2 - |v_1|^2\big]\, a^2}{(a^2 + WN_0)(WN_0)}. \qquad (9.54)$$

The maximum likelihood (ML) decision rule is therefore to decode $U = (a, 0)$ if $|v_0|^2 \ge |v_1|^2$ and
decode $U = (0, a)$ otherwise. Given the symmetry of the problem, this is certainly no surprise. It
may however be somewhat surprising that this rule does not depend on any possible dependence
between $G_0$ and $G_1$.

Next consider the ML probability of error. Let $X_m = |V_m|^2$ for $m = 0, 1$. The probability
densities of $X_0 \ge 0$ and $X_1 \ge 0$, conditioning on $U = (a, 0)$ throughout, are then given by

$$f_{X_0}(x_0) = \frac{1}{a^2 + WN_0} \exp\Big\{ -\frac{x_0}{a^2 + WN_0} \Big\}; \qquad f_{X_1}(x_1) = \frac{1}{WN_0} \exp\Big\{ -\frac{x_1}{WN_0} \Big\}.$$

Then, $\Pr(X_1 > x) = \exp(-x/WN_0)$ for $x \ge 0$, and therefore

$$\Pr(X_1 > X_0) = \int_0^\infty \frac{1}{a^2 + WN_0} \exp\Big\{ -\frac{x_0}{a^2 + WN_0} \Big\} \exp\Big\{ -\frac{x_0}{WN_0} \Big\}\, dx_0 = \frac{1}{2 + \frac{a^2}{WN_0}}. \qquad (9.55)$$

Since $X_1 > X_0$ is the condition for an error when $U = (a, 0)$, this is $\Pr(e)$ under the hypothesis
$U = (a, 0)$. By symmetry, the error probability is the same under the hypothesis $U = (0, a)$,
so this is the unconditional probability of error.


[21] $V_0$ and $V_1$ are complex random variables, so the probability density of each is defined as probability per unit
area in the complex plane (i.e., over the real and imaginary parts). If $V_0$ and $V_1$ are represented by amplitude
and phase, for example, the densities are different.
The mean signal power is $a^2/2$ since half the inputs have a square value $a^2$ and half have value
0. There are $W/2$ binary symbols per second, so $E_b$, the energy per bit, is $a^2/W$. Substituting
this into (9.55),

$$\Pr(e) = \frac{1}{2 + E_b/N_0}. \qquad (9.56)$$


This is a very discouraging result. To get an error probability $\Pr(e) = 10^{-3}$ would require
$E_b/N_0 \approx 1000$ (30 dB). Stupendous amounts of power would be required for more reliable
communication.
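The result (9.56) can be checked by simulation. The following is a minimal sketch, not part of the original development: it assumes the noise variance $WN_0$ is normalized to 1 (so that $a^2 = E_b/N_0$), and the function names are chosen here for illustration only. It draws $G_0$, $Z_0$, $Z_1$ as in the model above, applies the rule "choose the sample with the larger magnitude," and compares the observed error rate with $1/(2 + E_b/N_0)$.

```python
import math
import random


def complex_gaussian(rng, variance):
    """Draw one circularly symmetric complex Gaussian sample CN(0, variance)."""
    s = math.sqrt(variance / 2.0)  # standard deviation per real dimension
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def simulate_ppm_rayleigh(eb_over_n0, trials=200_000, seed=1):
    """Estimate Pr(e) for binary PPM over flat Rayleigh fading.

    Hypothesis U = (a, 0) is sent every time (the other case is symmetric).
    Assumption for this sketch: W*N0 is normalized to 1, so a^2 = Eb/N0.
    """
    rng = random.Random(seed)
    wn0 = 1.0
    a = math.sqrt(eb_over_n0 * wn0)
    errors = 0
    for _ in range(trials):
        v0 = a * complex_gaussian(rng, 1.0) + complex_gaussian(rng, wn0)
        v1 = complex_gaussian(rng, wn0)
        if abs(v1) ** 2 > abs(v0) ** 2:  # ML rule picks the wrong position
            errors += 1
    return errors / trials


def theoretical_error(eb_over_n0):
    """Equation (9.56): Pr(e) = 1 / (2 + Eb/N0)."""
    return 1.0 / (2.0 + eb_over_n0)


if __name__ == "__main__":
    for ebn0_db in (0, 10, 20):
        ebn0 = 10 ** (ebn0_db / 10)
        print(ebn0_db, "dB:", simulate_ppm_rayleigh(ebn0), theoretical_error(ebn0))
```

The simulated and theoretical values should agree to within the usual Monte Carlo fluctuation; the slow $1/(E_b/N_0)$ decay is visible directly in the printed values.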


After some reflection, however, this result is not too surprising. There is a constant signal energy
$E_b$ per bit, independent of the channel response $G_m$. The errors generally occur when the sample
values $|g_m|^2$ are small; i.e., during fades. Thus the damage here is caused by the combination
of fading and constant signal power. This result, and the result to follow, make it clear that to 
achieve reliable communication, it is necessary either to have diversity and/or coding between 
faded and unfaded parts of the channel, or to use channel measurement and feedback to control 
the signal power in the presence of fades. 


9.6.2 Non-coherent detection with known channel magnitude


Consider the same binary pulse position modulation of the previous subsection, but now assume
that $G_0$ and $G_1$ have the same magnitude, and that the sample value of this magnitude, say $g$,
is a fixed parameter that is known at the receiver. The phase $\phi_m$ of $G_m$, $m = 0, 1$, is uniformly
distributed over $[0, 2\pi)$ and is unknown at the receiver. The term non-coherent detection is used
for detection that does not make use of a recovered carrier phase, and thus applies here. We
will see that the joint density of $\phi_0$ and $\phi_1$ is immaterial. Assume the same noise distribution
as before. Under hypothesis $U = (a, 0)$, the outputs $V_0$ and $V_1$ are given by

$$V_0 = ag \exp\{i\phi_0\} + Z_0; \qquad V_1 = Z_1 \qquad (\text{under } U = (a, 0)). \qquad (9.57)$$

Similarly, under $U = (0, a)$,

$$V_0 = Z_0; \qquad V_1 = ag \exp\{i\phi_1\} + Z_1 \qquad (\text{under } U = (0, a)). \qquad (9.58)$$

Only $V_0$ and $V_1$, along with the fixed channel magnitude $g$, can be used in the decision, but it
will turn out that the value of $g$ is not needed for an ML decision. The channel phases $\phi_0$ and
$\phi_1$ are not observed and cannot be used in the decision.


The probability density of a complex random variable is usually expressed as the joint density
of the real and imaginary parts, but here it is more convenient to use the joint density of
magnitude and phase. Since the phase $\phi_0$ of $ag \exp\{i\phi_0\}$ is uniformly distributed, and since $Z_0$
is independent with uniform phase, it follows that $V_0$ has uniform phase; i.e., $\angle V_0$ is uniform
conditional on $U = (a, 0)$. The magnitude $|V_0|$, conditional on $U = (a, 0)$, is a Rician random
variable which is independent of $\phi_0$, and therefore also independent of $\angle V_0$. Thus, conditional
on $U = (a, 0)$, $V_0$ has independent phase and amplitude, and uniformly distributed phase.


Similarly, conditional on $U = (0, a)$, $V_0 = Z_0$ has independent phase and amplitude, and
uniformly distributed phase. What this means is that both the hypothesis and $|V_0|$ are statistically
independent of the phase $\angle V_0$. It can be seen that they are also statistically independent of $\phi_0$.


Using the same argument on $V_1$, we see that both the hypothesis and $|V_1|$ are statistically
independent of the phases $\angle V_1$ and $\phi_1$. It should then be clear that $|V_0|$, $|V_1|$, and the hypothesis
are independent of the phases $(\angle V_0, \angle V_1, \phi_0, \phi_1)$. This means that the sample values $|v_0|^2$ and
$|v_1|^2$ are sufficient statistics for choosing between the hypotheses $U = (a, 0)$ and $U = (0, a)$.


Given the sufficient statistics $|v_0|^2$ and $|v_1|^2$, we must determine the ML detection rule, again
assuming equiprobable hypotheses. Since $v_0$ contains the signal under hypothesis $U = (a, 0)$, and
$v_1$ contains the signal under hypothesis $U = (0, a)$, and since the problem is symmetric between
$U = (a, 0)$ and $U = (0, a)$, it appears obvious that the ML detection rule is to choose $U = (a, 0)$
if $|v_0|^2 > |v_1|^2$ and to choose $U = (0, a)$ otherwise. Unfortunately, to show this analytically, it
seems necessary to calculate the likelihood ratio. The appendix gives this likelihood ratio and
calculates the probability of error. The error probability for a given $g$ is derived there as

$$\Pr(e) = \frac{1}{2} \exp\Big( -\frac{a^2 g^2}{2WN_0} \Big). \qquad (9.59)$$

The mean received baseband signal power is $a^2 g^2/2$ since only half the inputs are used. There
are $W/2$ bits per second, so $E_b = a^2 g^2/W$. Thus, this probability of error can be expressed as

$$\Pr(e) = \frac{1}{2} \exp\Big( -\frac{E_b}{2N_0} \Big) \qquad (\text{non-coherent}). \qquad (9.60)$$


It is interesting to compare the performance of this non-coherent detector with that of a coherent
detector (i.e., a detector such as those in Chapter 8 that use the carrier phase) for equal-energy
orthogonal signals. As seen before, the error probability in the latter case is

$$\Pr(e) = Q\Big( \sqrt{\frac{E_b}{N_0}} \Big) \approx \sqrt{\frac{N_0}{2\pi E_b}} \exp\Big( -\frac{E_b}{2N_0} \Big) \qquad (\text{coherent}). \qquad (9.61)$$

Thus both expressions have the same exponential decay with $E_b/N_0$ and differ only in the
coefficient. The error probability with non-coherent detection is still substantially higher[22] than
with coherent detection, but the difference is nothing like that in (9.56). More to the point, if
$E_b/N_0$ is large, we see that the additional energy per bit required in non-coherent communication
to make the error probability equal to that of coherent communication is very small. In other
words, a small increment in dB corresponds to a large decrease in error probability. Of course,
with non-coherent detection, we also pay a 3 dB penalty for not being able to use antipodal
signaling.
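The comparison between (9.60) and (9.61) can be made concrete with a few lines of arithmetic. The sketch below is illustrative only (the function names are not from the text); it evaluates both expressions at a given $E_b/N_0$ and computes how many additional dB the non-coherent detector needs to match a target error probability, by inverting (9.60) as $E_b/N_0 = -2\ln(2\Pr(e))$.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = Pr(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pe_noncoherent(eb_over_n0):
    """Equation (9.60): non-coherent detection of binary orthogonal signals."""
    return 0.5 * math.exp(-eb_over_n0 / 2.0)


def pe_coherent(eb_over_n0):
    """Equation (9.61), exact form: coherent detection of orthogonal signals."""
    return q_function(math.sqrt(eb_over_n0))


def noncoherent_ebn0_for(target_pe):
    """Invert (9.60): the Eb/N0 at which non-coherent detection reaches target_pe."""
    return -2.0 * math.log(2.0 * target_pe)


def extra_db_for_noncoherent(eb_over_n0):
    """Additional dB needed by the non-coherent detector to match coherent Pr(e)."""
    needed = noncoherent_ebn0_for(pe_coherent(eb_over_n0))
    return 10.0 * math.log10(needed / eb_over_n0)


if __name__ == "__main__":
    ebn0 = 26.24
    print("non-coherent:", pe_noncoherent(ebn0))
    print("coherent:    ", pe_coherent(ebn0))
    print("extra dB:    ", extra_db_for_noncoherent(ebn0))
```

Running this at $E_b/N_0 = 26.24$ lets the reader check the figures quoted in the footnote; the exact $Q$ function gives a slightly smaller coherent error probability than the exponential approximation in (9.61).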


Early telephone-line modems (in the 1200 bits per second range) used non-coherent detection, 
but current high-speed wireline modems generally track the carrier phase and use coherent 
detection. Wireless systems are subject to rapid phase changes because of the transmission 
medium, so non-coherent techniques are still common there. 


It is even more interesting to compare the non-coherent result here with the Rayleigh fading
result. Note that both use the same detection rule, and thus knowledge of the magnitude of the
channel strength at the receiver in the Rayleigh case would not reduce the error probability. As
shown in Exercise 9.11, if we regard $g$ as a sample value of a random variable that is known at
the receiver, and average over the result in (9.59), then the error probability is the same as that
in (9.56).


[22] As an example, achieving $\Pr(e) = 10^{-6}$ with non-coherent detection requires $E_b/N_0$ to be 26.24, which would
yield $\Pr(e) = 1.6 \times 10^{-7}$ with coherent detection. However, it would require only about half a dB of additional
power to achieve that lower error probability with non-coherent detection.

The conclusion from this comparison is that the real problem with binary communication over 
flat Rayleigh fading is that when the signal is badly faded, there is little hope for successful 
transmission using a fixed amount of signal energy. It has just been seen that knowledge of the 
fading amplitude at the receiver does not help. Also, as seen in the second part of Exercise 
9.11, using power control at the transmitter to maintain a fixed error probability for binary 
communication leads to infinite average transmission power. The only hope, then, is either to 
use variable rate transmission or to use coding and/or diversity. In this latter case, knowledge of 
the fading magnitude will be helpful at the receiver in knowing how to weight different outputs 
in making a block decision. 


Finally, consider the use of only $V_0$ and $V_1$ in binary detection for Rayleigh fading and
non-coherent detection. If there are no inputs other than the binary input at times 0 and 1, then
all other outputs can be seen to be independent of the hypothesis and of $V_0$ and $V_1$. If there
are other inputs, however, the resulting outputs can be used to measure both the phase and
amplitude of the channel taps.


The results in the previous two sections apply to any pair of equal energy baseband signals that 
are orthogonal as complex waveforms (i.e., the real and imaginary parts of one waveform are 
orthogonal to both the real and imaginary parts of the other waveform). For this more general 
result, however, we must assume that $G_m$ is constant over the range of $m$ used by the signals.


9.6.3 Non-coherent detection in flat Rician fading 


Flat Rician fading occurs when the channel can be represented by a single tap and one path 
is significantly stronger than the other paths. This is a reasonable model when a line of sight 
path exists between transmitter and receiver, accompanied by various reflected paths. Perhaps 
more important, this model provides a convenient middle ground between a large number of 
weak paths, modeled by Rayleigh fading, and a single path with random phase, modeled in the 
last subsection. The error probability is easy to calculate in the Rician case, and contains the 
Rayleigh case and known magnitude case as special cases. When we study diversity, the Rician 
model provides additional insight into the benefits of diversity. 


As with Rayleigh fading, consider binary pulse position modulation where $U = u^0 = (a, 0)$
under one hypothesis and $U = u^1 = (0, a)$ under the other hypothesis. The corresponding
outputs are then

$$V_0 = U_0 G_0 + Z_0 \qquad \text{and} \qquad V_1 = U_1 G_1 + Z_1.$$

Using non-coherent detection, ML detection is the same for Rayleigh, Rician, or deterministic
channels, i.e., given sample values $v_0$ and $v_1$ at the receiver,

$$|v_0|^2 \;\overset{\tilde{U} = u^0}{\underset{\tilde{U} = u^1}{\gtrless}}\; |v_1|^2. \qquad (9.62)$$

The magnitude of the strong path is denoted by $\bar{g}$ and the collective variance of the weaker
paths is denoted by $\sigma_g^2$. Since only the magnitudes of $v_0$ and $v_1$ are used in detection, the phases
of the tap gains $G_0$ and $G_1$ do not affect the decision, so the tap gains can be modeled as
$G_0 \sim G_1 \sim \mathcal{CN}(\bar{g}, \sigma_g^2)$. This is explained more fully, for the known magnitude case, in the
appendix.


From the symmetry between the two hypotheses, the error probability is clearly the same for
both. Thus the error probability will be calculated conditional on $U = u^0$. All of the following
probabilities and probability densities are assumed to be conditional on $U = u^0$. Under this
conditioning, the real and imaginary parts of $V_0$ and $V_1$ are independent and characterized by

$$V_{0,re} \sim \mathcal{N}(a\bar{g}, \sigma_0^2), \qquad V_{0,im} \sim \mathcal{N}(0, \sigma_0^2),$$
$$V_{1,re} \sim \mathcal{N}(0, \sigma_1^2), \qquad V_{1,im} \sim \mathcal{N}(0, \sigma_1^2),$$

where

$$\sigma_0^2 = \frac{WN_0 + a^2\sigma_g^2}{2}; \qquad \sigma_1^2 = \frac{WN_0}{2}. \qquad (9.63)$$

Observe that $|V_1|^2$ is an exponentially distributed rv and for any $x \ge 0$, $\Pr(|V_1|^2 \ge x) =
\exp(-x/2\sigma_1^2)$. Thus the probability of error, conditional on $|V_0|^2 = x$, is $\exp(-x/2\sigma_1^2)$. The
unconditional probability of error (still conditioning on $U = u^0$) can then be found by averaging
over $V_0$.


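$$\Pr(e) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{1}{2\pi\sigma_0^2} \exp\Big[ -\frac{(v_{0,re} - a\bar{g})^2}{2\sigma_0^2} - \frac{v_{0,im}^2}{2\sigma_0^2} - \frac{v_{0,re}^2 + v_{0,im}^2}{2\sigma_1^2} \Big]\, dv_{0,re}\, dv_{0,im}.$$

Integrating this over $v_{0,im}$,

$$\Pr(e) = \sqrt{\frac{2\pi\sigma_0^2\sigma_1^2}{\sigma_0^2 + \sigma_1^2}} \int_{-\infty}^{\infty} \frac{1}{2\pi\sigma_0^2} \exp\Big[ -\frac{(v_{0,re} - a\bar{g})^2}{2\sigma_0^2} - \frac{v_{0,re}^2}{2\sigma_1^2} \Big]\, dv_{0,re}.$$

This can be integrated by completing the square in the exponent, resulting in

$$\frac{\sigma_1^2}{\sigma_0^2 + \sigma_1^2} \exp\Big[ -\frac{a^2\bar{g}^2}{2(\sigma_0^2 + \sigma_1^2)} \Big].$$

Substituting the values for $\sigma_0$ and $\sigma_1$ from (9.63), the result is

$$\Pr(e) = \frac{1}{2 + \frac{a^2\sigma_g^2}{WN_0}} \exp\Big[ -\frac{a^2\bar{g}^2}{2WN_0 + a^2\sigma_g^2} \Big].$$

Finally, the channel gain should be normalized so that $\bar{g}^2 + \sigma_g^2 = 1$. Then $E_b$ becomes $a^2/W$
and

$$\Pr(e) = \frac{1}{2 + \frac{E_b\sigma_g^2}{N_0}} \exp\Big[ -\frac{\bar{g}^2 E_b}{2N_0 + E_b\sigma_g^2} \Big]. \qquad (9.64)$$

In the Rayleigh fading case, $\bar{g} = 0$ and $\sigma_g^2 = 1$, simplifying $\Pr(e)$ to $\frac{1}{2 + E_b/N_0}$, agreeing with
the result derived earlier. For the fixed amplitude case, $\bar{g} = 1$ and $\sigma_g^2 = 0$, reducing $\Pr(e)$ to
$\frac{1}{2}\exp(-E_b/2N_0)$, again agreeing with the earlier result.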
It is important to realize that this result does not depend on the receiver knowing that a strong
path exists, since the detection rule is the same for non-coherent detection whether the fading is
Rayleigh, Rician, or deterministic. The result says that with Rician fading, the error probability
can be much smaller than with Rayleigh. However, if $\sigma_g^2 > 0$, the exponent approaches a
constant with increasing $E_b$, and $\Pr(e)$ still goes to zero with $(E_b/N_0)^{-1}$. What this says, then,
is that this slow approach to zero error probability with increasing $E_b$ cannot be avoided by
a strong specular path, but only by the lack of an arbitrarily large number of arbitrarily weak
paths. This is discussed further when we discuss diversity.
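The Rician expression (9.64) and its limiting behavior are easy to explore numerically. The following sketch is illustrative (the function name and argument names are chosen here, not taken from the text). It takes $E_b/N_0$ and the strong-path power fraction $\bar{g}^2$, uses the normalization $\sigma_g^2 = 1 - \bar{g}^2$, and can be used to confirm the two special cases as well as the $(E_b/N_0)^{-1}$ decay when $\sigma_g^2 > 0$.

```python
import math


def pe_rician(eb_over_n0, g_bar_sq):
    """Equation (9.64) with the normalization g_bar^2 + sigma_g^2 = 1.

    eb_over_n0 : Eb/N0 as a ratio (not in dB)
    g_bar_sq   : fraction of channel power in the strong path, in [0, 1]
    """
    sigma_g_sq = 1.0 - g_bar_sq
    denom = 2.0 + eb_over_n0 * sigma_g_sq
    return math.exp(-g_bar_sq * eb_over_n0 / denom) / denom


if __name__ == "__main__":
    for ebn0_db in (10, 20, 30, 40):
        ebn0 = 10 ** (ebn0_db / 10)
        row = [pe_rician(ebn0, frac) for frac in (0.0, 0.5, 0.9, 1.0)]
        print(ebn0_db, "dB:", ["%.3e" % p for p in row])
```

For large $E_b/N_0$ and $\sigma_g^2 > 0$, (9.64) behaves like $\exp(-\bar{g}^2/\sigma_g^2) / (\sigma_g^2 E_b/N_0)$: the strong path lowers the constant but does not change the inverse-linear decay, which is the point made in the paragraph above.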


9.7 Channel measurement 


This section introduces the topic of dynamically measuring the taps in the discrete-time baseband 
model of a wireless channel. Such measurements are made at the receiver based on the received 
waveform. They can be used to improve the detection of the received data, and, by sending the 
measurements back to the transmitter, to help in power and rate control at the transmitter. 


One approach to channel measurement is to allocate a certain portion of each transmitted packet 
for that purpose. During this period, a known probing sequence is transmitted and the receiver 
uses this known sequence either to estimate the current values for the taps in the discrete-time 
baseband model of the channel or to measure the actual paths in a continuous-time baseband 
model. Assuming that the actual values for these taps or paths do not change rapidly, these 
estimated values can then help in detecting the remainder of the packet. 


Another technique for channel measurement is called a rake receiver. Here the detection of the 
data and the estimation of the channel are done together. For each received data symbol, the 
symbol is detected using the previous estimate of the channel and then the channel estimate is 
updated for use on the next data symbol. 


Before studying these measurement techniques, it will be helpful to understand how such
measurements will help in detection. In studying binary detection for flat-fading Rayleigh channels,
we saw that the error probability is very high in periods of deep fading, and that these periods
are frequent enough to make the overall error probability large even when $E_b/N_0$ is large. In
studying non-coherent detection, we found that the ML detector does not use its knowledge of
the channel strength, and thus, for binary detection in flat Rayleigh fading, knowledge at the
receiver of the channel strength is not helpful. Finally, we saw that when the channel is good
(the instantaneous $E_b/N_0$ is high), knowing the phase at the receiver is of only limited benefit.


It turns out, however, that binary detection on a flat-fading channel is very much a special case, 
and that channel measurement can be very helpful at the receiver both for non-flat fading and 
for larger signal sets such as coded systems. Essentially, when the receiver observation consists 
of many degrees of freedom, knowledge of the channel helps the detector weight these degrees 
of freedom appropriately. 


Feeding channel measurement information back to the transmitter can be helpful in general,
even in the case of binary transmission in flat fading. The transmitter can then send more
power when the channel is poor, thus maintaining a constant error probability,[23] or can send
at higher rates when the channel is good. The typical round trip delay from transmitter to
receiver in cellular systems is usually on the order of a few microseconds or less, whereas typical
coherence times are on the order of 100 msec. or more. Thus feedback control can be exercised
within the interval over which a channel is relatively constant.


[23] Exercise 9.11 shows that this leads to infinite expected power on a pure flat-fading Rayleigh channel, but in
practice the very deep fades that require extreme instantaneous power simply lead to outages.


9.7.1 The use of probing signals to estimate the channel 


Consider a discrete-time baseband channel model in which the channel, at any given output time
$m$, is represented by a given number $k_0$ of randomly varying taps, $G_{0,m}, \ldots, G_{k_0-1,m}$. We will
study the estimation of these taps by the transmission of a probing signal consisting of a known
string of input signals. The receiver, knowing the transmitted signals, estimates the channel
taps. This procedure has to be repeated at least once for each coherence-time interval.


One simple (but not very good) choice for such a known signal is to use an input of maximum
amplitude, say $a$, at a given epoch, say epoch 0, followed by zero inputs for the next $k_0 - 1$
epochs. The received sequence over the corresponding $k_0$ epochs in the absence of noise is then
$(aG_{0,0}, aG_{1,1}, \ldots, aG_{k_0-1,k_0-1})$. In the presence of sample values $z_0, z_1, \ldots$ of complex discrete-time
WGN, the output $v = (v_0, \ldots, v_{k_0-1})^T$ from time 0 to $k_0 - 1$ is then

$$v = \big( a g_{0,0} + z_0,\; a g_{1,1} + z_1,\; \ldots,\; a g_{k_0-1,k_0-1} + z_{k_0-1} \big)^T.$$

A reasonable estimate of the $k$th channel tap, $0 \le k \le k_0 - 1$, is then

$$\hat{g}_{k,k} = \frac{v_k}{a}. \qquad (9.65)$$


The principles of estimation are quite similar to those of detection, but are not essential here. In
detection, an observation (a sample value $v$ of a random variable or vector $V$) is used to select
a choice $\tilde{u}$ from the possible sample values of a discrete random variable $U$ (the hypothesis). In
estimation, a sample value $v$ of $V$ is used to select a choice $\hat{g}$ from the possible sample values
of a continuous rv $G$. In both cases, the likelihoods $f_{V|U}(v|u)$ or $f_{V|G}(v|g)$ are assumed to be
known and the a priori probabilities $p_U(u)$ or $f_G(g)$ are assumed to be known.


Estimation, like detection, is concerned with determining and implementing reasonable rules for 
estimating $g$ from $v$. A widely used rule is the maximum likelihood (ML) rule. This chooses
the estimate $\hat{g}$ to be the value of $g$ that maximizes $f_{V|G}(v|g)$. The ML rule for estimation is the
same as the ML rule for detection. Note that the estimate in (9.65) is a ML estimate.


Another widely used estimation rule is minimum mean square error (MMSE) estimation. The
MMSE rule chooses the estimate $\hat{g}$ to be the mean of the a posteriori probability density $f_{G|V}(g|v)$
for the given observation $v$. In many cases, such as where $G$ and $V$ are jointly Gaussian, this
mean is the same as the value of $g$ which maximizes $f_{G|V}(g|v)$. Thus the MMSE rule is somewhat
similar to the MAP rule of detection theory.


For detection problems, the ML rule is usually chosen when the a priori probabilities are all the 
same, and in this case ML and MAP are equivalent. For estimation problems, ML is more often 
chosen when the a priori probability density is unknown. When the a priori density is known, 
the MMSE rule typically has a strictly smaller mean square estimation error than the ML rule. 


For the situation at hand, there is usually very little basis for assuming any given model for 
the channel taps (although Rayleigh and Rician models are frequently used in order to have 
something specific to discuss). Thus the ML estimate makes considerable sense and is commonly 
used. Since the channel changes very slowly with time, it is reasonable to assume that the 
measurement in (9.65) can be used at any time within a given coherence interval. It is also
possible to repeat the above procedure several times within one coherence interval. The multiple 
measurements of each channel filter tap can then be averaged (corresponding to ML estimation 
based on the multiple observations). 
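The single-pulse measurement and the averaging of repeated measurements can be sketched in a few lines. This is a minimal illustration under stated assumptions: the taps are held fixed over the measurement (as within one coherence interval), the noise is complex WGN of variance $WN_0$ per sample, and the function names and tap values used in the example are hypothetical.

```python
import math
import random


def single_pulse_estimates(taps, a, wn0, rng):
    """One probe: send amplitude a at epoch 0, then zeros.

    The k-th output is a*g_k + z_k, and the estimate of g_k is v_k / a,
    as in (9.65).
    """
    s = math.sqrt(wn0 / 2.0)  # noise standard deviation per real dimension
    estimates = []
    for g in taps:
        z = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        estimates.append((a * g + z) / a)
    return estimates


def averaged_estimates(taps, a, wn0, repeats, rng):
    """Average the single-pulse estimates over several probes."""
    sums = [0j] * len(taps)
    for _ in range(repeats):
        for k, est in enumerate(single_pulse_estimates(taps, a, wn0, rng)):
            sums[k] += est
    return [s / repeats for s in sums]


if __name__ == "__main__":
    rng = random.Random(0)
    taps = [0.6 + 0.2j, -0.1 + 0.5j, 0.05 - 0.3j]  # hypothetical tap values
    print("one probe:  ", single_pulse_estimates(taps, 1.0, 0.1, rng))
    print("100 probes: ", averaged_estimates(taps, 1.0, 0.1, 100, rng))
```

Each single-pulse estimate has noise variance $WN_0/a^2$; averaging $r$ probes reduces this to $WN_0/(r a^2)$, which is why the peak constraint on $a$ discussed next limits the accuracy available from a single pulse.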


The problem with the single pulse approach above is that a peak constraint usually exists on the 
input sequence; this is imposed both to avoid excessive interference to other channels and also 
to simplify implementation. If the square of this peak constraint is little more than the energy 
constraint per symbol, then a long input sequence with equal energy in each symbol will allow 
much more signal energy to be used in the measurement process than the single pulse approach. 
As seen in what follows, this approach will then yield more accurate estimates of the channel 
response than the single pulse approach. 


Using a predetermined antipodal pseudo-noise (PN) input sequence $u = (u_1, \ldots, u_n)^T$ is a good
way to perform channel measurements with such evenly distributed energy.[24] The components
$u_1, \ldots, u_n$ of $u$ are selected to be $\pm a$ and the desired property is that the covariance function
of $u$ approximates an impulse. That is, the sequence is chosen to satisfy

$$\sum_{m=1}^{n} u_m u_{m+k} \approx \begin{cases} a^2 n; & k = 0 \\ 0; & k \ne 0 \end{cases} \;=\; a^2 n \delta_k, \qquad (9.66)$$

where $u_m$ is taken to be 0 outside of $[1, n]$. For long PN sequences, the error in this approximation
can be viewed as additional but negligible noise. The implementation of such vectors (in binary
rather than antipodal form) is discussed at the end of this subsection.


An almost obvious variation on choosing $u$ to be an antipodal PN sequence is to choose it to
be complex with antipodal real and imaginary parts, i.e., to be a 4-QAM sequence. Choosing
the real and imaginary parts to be antipodal PN sequences and also to be approximately
uncorrelated, (9.66) becomes

$$\sum_{m=1}^{n} u_m u^*_{m+k} \approx 2a^2 n \delta_k. \qquad (9.67)$$

The QAM form spreads the input measurement energy over twice as many degrees of freedom
for the given $n$ time units, and is thus usually advantageous. Both the antipodal and the 4-QAM
form, as well as the binary version of the antipodal form, are referred to as PN sequences.
The QAM form is assumed in what follows, but the only difference between (9.66) and (9.67)
is the factor of 2 in the covariance. It is also assumed for simplicity that (9.67) is satisfied with
equality.

The condition (9.67) (with equality) states that $u$ is orthogonal to each of its time shifts. This
condition can also be expressed by defining the matched filter sequence for $u$ as the sequence $u^\dagger$
where $u^\dagger_j = u^*_{-j}$. That is, $u^\dagger$ is the complex conjugate of $u$ reversed in time. The convolution
of $u$ with $u^\dagger$ is then $u * u^\dagger = \sum_m u_m u^\dagger_{k-m}$. The covariance condition in (9.67) (with equality)
is then equivalent to the convolution condition,

$$u * u^\dagger = \sum_{m=1}^{n} u_m u^\dagger_{k-m} = \sum_{m=1}^{n} u_m u^*_{m-k} = 2a^2 n \delta_k. \qquad (9.68)$$

[24] This approach might appear to be an unimportant detail here, but it becomes more important for the rake
receiver to be discussed shortly.


Cite as: Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare
(http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].




Let the complex-valued rv $G_{k,m}$ be the value of the kth channel tap at time m. The channel
output at time m for the input sequence u (before adding noise) is the convolution

$$V'_m = \sum_{k=0}^{n-1} G_{k,m}\, u_{m-k}. \qquad (9.69)$$

Since u is zero outside of the interval [1, n], the noise-free output sequence $V'$ is zero outside
of $[1, n+k_0-1]$. Assuming that the channel is random but unchanging during this interval, the
kth tap can be expressed as the complex rv $G_k$. Correlating the channel output with $u^*_1, \ldots, u^*_n$
results in the covariance at each epoch j given by

$$C'_j = \sum_{m=-j+1}^{-j+n} V'_m u^*_{m+j} = \sum_{m=-j+1}^{-j+n} \sum_{k=0}^{n-1} G_k\, u_{m-k}\, u^*_{m+j} \qquad (9.70)$$

$$\phantom{C'_j} = \sum_{k=0}^{n-1} G_k (2a^2 n)\,\delta_{j+k} = 2a^2 n\, G_{-j}. \qquad (9.71)$$


Thus the result of correlation, in the absence of noise, is the set of channel filter taps, scaled 
and reversed in time. 


It is easier to understand this by looking at the convolution of $V'$ with $u^\dagger$. That is,

$$V' * u^\dagger = (u * G) * u^\dagger = (u * u^\dagger) * G = 2a^2 n\, G.$$


This uses the fact that convolution of sequences (just like convolution of functions) is both 
associative and commutative. Note that the result of convolution with the matched filter is 
the time reversal of the result of correlation, and is thus simply a scaled replica of the channel 
taps. Finally note that the matched filter $u^\dagger$ is zero outside of the interval [−n, −1]. Thus if
we visualize implementing the measurement of the channel using such a discrete filter, we are 
assuming (conceptually) that the receiver time reference lags the transmitter time reference by 
at least n epochs. 


With the addition of noise, the overall output is $V = V' + Z$, i.e., the output at epoch m is
$V_m = V'_m + Z_m$. Thus the convolution of the noisy channel output with the matched filter $u^\dagger$ is
given by

$$V * u^\dagger = V' * u^\dagger + Z * u^\dagger = 2a^2 n\, G + Z * u^\dagger. \qquad (9.72)$$

After dividing by $2a^2 n$, the kth component of this vector equation is

$$\frac{1}{2a^2 n} \sum_m V_m u^\dagger_{k-m} = G_k + \Psi_k, \qquad (9.73)$$

where $\Psi_k$ is defined as the complex random variable

$$\Psi_k = \frac{1}{2a^2 n} \sum_m Z_m u^\dagger_{k-m}. \qquad (9.74)$$


This estimation procedure is illustrated in Figure 9.9. 
[Block diagram: u -> channel G -> $V'$ -> add noise Z -> V -> matched filter $u^\dagger$ -> scale by $1/(2a^2 n)$ -> $G + \Psi$]

Figure 9.9: Illustration of channel measurement using a filter matched to a PN input. We
have assumed that G is nonzero only in the interval $[0, k_0-1]$ so the output is observed only
in this interval. Note that the component G in the output is the response of the matched
filter to the input u, whereas $\Psi$ is the response to Z.


Assume that the channel noise is white Gaussian noise so that the discrete-time noise variables
$\{Z_m\}$ are circularly symmetric $\mathcal{CN}(0, WN_0)$ and iid, where W/2 is the baseband bandwidth²⁵.
Since u is orthogonal to each of its time shifts, its matched filter vector $u^\dagger$ must have the same
property. It then follows that

$$E[\Psi_k \Psi^*_i] = \frac{1}{4a^4 n^2} \sum_m E[|Z_m|^2]\, u^\dagger_{k-m} (u^\dagger_{i-m})^* = \frac{N_0 W}{2a^2 n}\,\delta_{k-i}. \qquad (9.75)$$

The random variables $\{\Psi_k\}$ are jointly Gaussian from (9.74) and uncorrelated from (9.75), so
they are independent Gaussian rv’s. It is a simple additional exercise to show that each $\Psi_k$ is
circularly symmetric, i.e., $\Psi_k \sim \mathcal{CN}(0, N_0 W/(2a^2 n))$.


Going back to (9.73), it can be seen that for each k, $0 \le k \le k_0 - 1$, the ML estimate of $G_k$ from
the observation of $G_k + \Psi_k$ is given by

$$\hat{G}_k = \frac{1}{2a^2 n} \sum_m V_m u^\dagger_{k-m}.$$

It can also be shown that this is the ML estimate of $G_k$ from the entire observation V, but
deriving this would take us too far afield. From (9.73), the error in this estimate is $\Psi_k$, so the
mean squared error in the real part of this estimate, and similarly in the imaginary part, is given
by $WN_0/(4a^2 n)$.

By increasing the measurement length n or by increasing the input magnitude a, we can make
the estimate arbitrarily good. Note that the mean squared error is independent of the fading
variables $\{G_k\}$; the noise in the estimate does not depend on how good or bad the channel is.
Finally observe that the energy in the entire measurement signal is $2a^2 nW$, so the mean squared
error is inversely proportional to the measurement-signal energy.
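The estimation procedure of (9.72) and (9.73) can be sketched numerically as follows. This is a minimal illustration, not the text's implementation: the tap values, the sequence length, the noise level `noise_var` (playing the role of $WN_0$), and the helper names are all assumptions, and a seeded random 4-QAM sequence stands in for a PN sequence.

```python
import random


def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def estimate_taps(v, u, k0):
    """Correlate v with u and scale by 1 / sum |u_m|^2 (= 1/(2 a^2 n) for 4-QAM)."""
    energy = sum(abs(x) ** 2 for x in u)
    estimates = []
    for k in range(k0):
        acc = sum(v[m + k] * u[m].conjugate() for m in range(len(u)))
        estimates.append(acc / energy)
    return estimates


def simulate(n=2048, a=1.0, noise_var=0.5, seed=1):
    """Send a random 4-QAM probe through hypothetical taps and estimate them."""
    rng = random.Random(seed)
    u = [complex(rng.choice((-a, a)), rng.choice((-a, a))) for _ in range(n)]
    g = [0.8 + 0.3j, -0.4j, 0.25 + 0j]          # hypothetical channel taps
    sigma = (noise_var / 2) ** 0.5              # std dev per real dimension
    v = [x + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
         for x in convolve(u, g)]
    return g, estimate_taps(v, u, len(g))


if __name__ == "__main__":
    g, g_hat = simulate()
    for true_tap, est in zip(g, g_hat):
        print(true_tap, est)
```

Because the probe here is only approximately orthogonal to its shifts, the estimates contain a small self-interference term in addition to the noise term $\Psi_k$; both shrink as n grows.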


What is the duration over which a channel measurement is valid? Fortunately, for most wireless
applications, the coherence time $\mathcal{T}_{coh}$ is many times larger than the delay spread, typically on
the order of hundreds of times larger. This means that it is feasible to measure the channel and
then use those measurements for an appreciable number of data symbols. There is, of course,
a tradeoff, since using a long measurement period n leads to an accurate measurement, but
uses an appreciable part of $\mathcal{T}_{coh}$ for measurement rather than data. This tradeoff becomes less
critical as the coherence time increases.


One clever technique that can be used to increase the number of data symbols covered by one 
measurement interval is to do the measurement in the middle of a data frame. It is also possible, 


25Recall that these noise variables are samples of white noise filtered to W/2. Thus their mean square value
(including both real and imaginary parts) is equal to the bandlimited noise power NoW. Viewed alternatively, the 
sinc functions in the orthogonal expansion have energy 1/W so the variance of each real and imaginary coefficient 
in the noise expansion must be scaled up by W from the noise energy No/2 per degree of freedom. 
for a given data symbol, to interpolate between the previous and the next channel measurement.
These techniques are used in the popular GSM cellular standard. These techniques appear to 
increase delay slightly, since the early data in the frame cannot be detected until after the 
measurement is made. However, if coding is used, this delay is necessary in any case. We have 
also seen that one of the primary purposes of measurement is for power/rate control, and this 
clearly cannot be exercised until after the measurement is made. 


The above measurement technique rests on the existence of PN sequences which approximate 
the correlation property in (9.67). PN sequences (in binary form) are generated by a procedure 
very similar to that by which output streams are generated in a convolutional encoder. In a 
convolutional encoder of constraint length n, each bit in a given output stream is the mod-2 sum 
of the current input and some particular pattern of the previous n inputs. Here there are no 
inputs, but instead, the output of the shift register is fed back to the input as shown in Figure
9.10.

[Diagram: a four-stage shift register holding $D_{k-1}, D_{k-2}, D_{k-3}, D_{k-4}$, with a mod-2 sum of selected stage outputs fed back as the new input $D_k$.]

Figure 9.10: A maximal-length shift register with n = 4 stages and a cycle of length $2^n - 1$
that cycles through all states except the all 0 state.


By choosing the stages that are summed mod 2 in an appropriate way (denoted a maximal-length
shift register), any non-zero initial state will cycle through all possible $2^n - 1$ non-zero states
before returning to the initial state. It is known that maximal-length shift registers exist for all
positive integers n.
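The cycling behavior can be checked with a short sketch of a 4-stage feedback shift register. The feedback taps used here ($D_k = D_{k-3} \oplus D_{k-4}$) are an assumption chosen because they give a maximal-length register; the connections actually drawn in Figure 9.10 are not recoverable from the text, and the function name is hypothetical.

```python
def lfsr_cycle(state, taps=(3, 4)):
    """Return the output bits of one full cycle of a binary feedback shift register.

    state: tuple of bits (D_{k-1}, ..., D_{k-n}), most recent first.
    taps:  delays whose stage contents are summed mod 2 to form the new bit
           (assumed values; they give a maximal-length register for n = 4).
    """
    start = tuple(state)
    cur = start
    out = []
    while True:
        new_bit = 0
        for t in taps:
            new_bit ^= cur[t - 1]
        out.append(new_bit)
        cur = (new_bit,) + cur[:-1]     # shift the new bit in, drop the oldest
        if cur == start:
            return out


if __name__ == "__main__":
    cycle = lfsr_cycle((0, 0, 0, 1))
    print(len(cycle), cycle)            # cycle length is 2**4 - 1 for any non-zero start
```

Starting from the all-zero state, the register stays there, which is why that state is excluded from the cycle.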


One of the nice properties of a maximal-length shift register is that it is linear (over mod-2
addition and multiplication). That is, let y be the sequence of length $2^n - 1$ bits generated by
the initial state x, and let y′ be that generated by the initial state x′. Then it can be seen with
a little thought that $y \oplus y'$ is generated by $x \oplus x'$. Thus the difference between any two such
cycles started in different initial states contains $2^{n-1}$ ones and $2^{n-1} - 1$ zeros. In other words,
the set of cycles forms a binary simplex code.


It can be seen that any nonzero cycle of a maximal length shift register has an almost ideal 
correlation with a cyclic shift of itself. Here, however, it is the correlation over a single period, 
where the shifted sequence is set to zero outside of the period, that is important. There is no 
guarantee that such a correlation is close to ideal, although these shift register sequences are 
usually used in practice to approximate the ideal. 
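The distinction between cyclic correlation and correlation over a single period can be seen numerically. The sketch below builds one period of a 4-stage maximal-length sequence (with assumed feedback taps $D_k = D_{k-3} \oplus D_{k-4}$, which are not taken from the text), maps it to antipodal form, and computes both kinds of correlation. The function names are hypothetical.

```python
def m_sequence_antipodal():
    """One period (15 chips) of a 4-stage maximal-length sequence, mapped 0 -> +1, 1 -> -1."""
    bits = [1, 0, 0, 0]
    while len(bits) < 15:
        bits.append(bits[-3] ^ bits[-4])   # assumed feedback taps
    return [1 - 2 * b for b in bits]


def cyclic_correlation(x, k):
    """Correlation of x with its cyclic shift by k."""
    n = len(x)
    return sum(x[m] * x[(m + k) % n] for m in range(n))


def aperiodic_correlation(x, k):
    """Correlation of x with its shift by k, with x set to zero outside one period."""
    n = len(x)
    return sum(x[m] * x[m + k] for m in range(n) if 0 <= m + k < n)


if __name__ == "__main__":
    x = m_sequence_antipodal()
    print([cyclic_correlation(x, k) for k in range(15)])     # 15 at k = 0, -1 elsewhere
    print([aperiodic_correlation(x, k) for k in range(15)])  # no such guarantee off-peak
```

The cyclic values follow from the simplex property above: the mod-2 sum of a cycle and a shifted copy has one more 1 than 0s, giving a correlation of −1. The single-period values carry no such guarantee, which is the caveat raised in the text.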


9.7.2. Rake receivers 


A Rake receiver is a type of receiver that combines channel measurement with data reception 
in an iterative way. It is primarily applicable to spread spectrum systems in which the input 
signals are pseudo-noise (PN) sequences. It is, in fact, just an extension of the pseudo-noise 
measurement technique described in the previous subsection. Before describing the rake receiver, 
it will be helpful to review binary detection, assuming that the channel is perfectly known and
unchanging over the duration of the signal. 


Let the input U be one of the two signals $u^0 = (u^0_1, \ldots, u^0_n)^T$ and $u^1 = (u^1_1, \ldots, u^1_n)^T$. Denote
the known channel taps as $g = (g_0, \ldots, g_{k_0-1})^T$. Then the channel output, before the addition
of white noise, is either $u^0 * g$, which we denote by $b_0$, or $u^1 * g$, which we denote by $b_1$.
These convolutions are contained within the interval $[1, n+k_0-1]$. After the addition of WGN,
the output is either $V = b_0 + Z$ or $V = b_1 + Z$. The detection problem is to decide, from
observation of V, which of these two possibilities is more likely. The LLR for this detection
problem is shown in Section 8.3.4 to be given by (8.26), repeated below,

$$\mathrm{LLR}(v) = \frac{-\|v - b_0\|^2 + \|v - b_1\|^2}{N_0} = \frac{2\Re(\langle v, b_0\rangle) - 2\Re(\langle v, b_1\rangle) - \|b_0\|^2 + \|b_1\|^2}{N_0}. \qquad (9.76)$$

It is shown in Exercise 9.17 that if $u^0$ and $u^1$ are ideal PN sequences, i.e., sequences that satisfy
(9.68), then $\|b_0\|^2 = \|b_1\|^2$. The ML test then simplifies to

$$\Re(\langle v, u^0 * g\rangle) \;\underset{\tilde{U}=u^1}{\overset{\tilde{U}=u^0}{\gtrless}}\; \Re(\langle v, u^1 * g\rangle). \qquad (9.77)$$


Finally, for i = 0, 1, the inner product $\langle v, u^i * g\rangle$ is simply the output at epoch 0 when v is
the input to a filter matched to $u^i * g$. The filter matched to $u^i * g$, however, is just the filter
matched to $u^i$ convolved with the filter matched to g. The block diagram for this is shown in
Figure 9.11.

[Block diagram: the input $u^0$ or $u^1$ passes through the channel g, and noise Z is added to give v. The upper branch filters v with $(u^1)^\dagger$ followed by $g^\dagger$; the lower branch filters v with $(u^0)^\dagger$ followed by $g^\dagger$. Both outputs enter the decision box.]

Figure 9.11: Detection for binary signals passed through a known filter g. The real parts of
the inputs entering the decision box at epoch 0 are compared. $\tilde{U} = u^0$ if the real part of the
lower input is larger, and $\tilde{U} = u^1$ is chosen otherwise.
If the signals above are PN sequences, there is a great similarity between Figures 9.9 and 9.11.
In particular, if $u^0$ is sent, then the output of the matched filter $(u^0)^\dagger$, i.e., the first part of the
lower matched filter, will be $2a^2 n\,g$ in the absence of noise. Note that g is a vector, meaning
that the noise-free output at epoch k is $2a^2 n\,g_k$. Similarly, if $u^1$ is sent, then the noise-free output
of the first part of the upper matched filter, at epoch k, will be $2a^2 n\,g_k$. The decision is made
at receiver time 0 after the sequence $2a^2 n\,g$, along with noise, passes through the unrealizable
filter $g^\dagger$. These unrealizable filters are made realizable by the delay in receiver timing relative
to transmitter timing.
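The decision rule (9.77) can be sketched directly in terms of inner products. The signals and channel taps in the usage lines are hypothetical values chosen so that $\|b_0\|^2 = \|b_1\|^2$, which is the condition under which (9.77) is the ML test; the helper names are assumptions.

```python
def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def inner(v, b):
    """Complex inner product <v, b> = sum_i v_i * conj(b_i)."""
    return sum(vi * bi.conjugate() for vi, bi in zip(v, b))


def detect(v, u0, u1, g):
    """Return 0 or 1 according to the larger of Re<v, u^i * g>.

    Assumes u0 * g and u1 * g have equal energy, as for ideal PN sequences.
    """
    m0 = inner(v, convolve(u0, g)).real
    m1 = inner(v, convolve(u1, g)).real
    return 0 if m0 >= m1 else 1


if __name__ == "__main__":
    u0 = [1, 1, 1, 1]            # hypothetical signals
    u1 = [1, -1, 1, -1]
    g = [1, 0.5j]                # hypothetical known channel taps
    print(detect(convolve(u0, g), u0, u1, g))   # noise-free reception of u0
    print(detect(convolve(u1, g), u0, u1, g))   # noise-free reception of u1
```

Computing the inner product with $u^i * g$ is equivalent to passing v through the cascade of matched filters in Figure 9.11 and sampling at epoch 0.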
Under the assumption that a correct decision is made, an estimate can also be made of the
channel filter g. In particular, if the decision is $\tilde{U} = u^0$, then the outputs of the first part of the
lower matched filter, at receiver times $-k_0 + 1$ to 0, will be scaled noisy versions of $g_0$ to $g_{k_0-1}$.
Instead of using these outputs as an ML estimate of the filter taps, they must be combined with
earlier estimates, constantly updating the current estimate each n epochs. This means that if
the coherence time is long, then the filter taps will change very slowly in time, and the continuing
set of channel estimates, one each n sample times, can be used to continually improve and track
the channel filter taps.


Note that the decision in Figure 9.11 was based on knowledge of g and thus knowledge of the
matched filter $g^\dagger$. The ability to estimate g as part of the data detection thus allows us to
improve the estimate $g^\dagger$ at the same time as making data decisions. When $\tilde{U} = u^i$ (and the
decision is correct), the outputs of the matched filter $(u^i)^\dagger$ provide an estimate of g, and thus
allow $g^\dagger$ to be updated. The combined structure for making decisions and estimating the channel
is called a rake receiver and is illustrated in Figure 9.12.

[Block diagram: the input $u^0$ or $u^1$ passes through the channel g, and noise Z is added. The received sequence feeds two branches, $(u^1)^\dagger$ followed by $g^\dagger$ and $(u^0)^\dagger$ followed by $g^\dagger$, whose outputs enter the decision box. The decision selects which matched-filter output is passed to an "Estimate g" block that updates both $g^\dagger$ filters.]

Figure 9.12: Rake Receiver. If $\tilde{U} = u^0$, then the corresponding $k_0$ outputs from the matched
filter $(u^0)^\dagger$ are used to update the estimate of g (and thus the taps of each matched filter
$g^\dagger$). Alternatively, if $\tilde{U} = u^1$, then the output from the matched filter $(u^1)^\dagger$ is used. These
updated matched filters $g^\dagger$ are then used, with the next block of outputs from $(u^0)^\dagger$ and $(u^1)^\dagger$,
to make the next decision, and so forth for subsequent estimates and decisions.


The rake receiver structure can only be expected to work well if the coherence time of the channel 
includes many decision points. That is, the updated channel estimate made on one decision can 
only be used on subsequent decisions. Since the channel estimates made at each decision epoch 
are noisy, and since the channel changes very slowly, the estimate g made at one decision epoch 
will only be used to make a small change to the existing estimate. 
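One decision-and-update step of this kind can be sketched as follows. The exponential-averaging update with a small `step` parameter is an assumption (one simple way to make "a small change to the existing estimate"), not a rule given in the text, and the function names are hypothetical.

```python
def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def correlate(v, u, k0):
    """Matched-filter outputs for taps 0..k0-1, scaled by the energy of u."""
    energy = sum(abs(x) ** 2 for x in u)
    return [sum(v[m + k] * u[m].conjugate() for m in range(len(u))) / energy
            for k in range(k0)]


def metric(v, u, g_hat):
    """Re<v, u * g_hat>, the decision statistic of (9.77) using the current estimate."""
    b = convolve(u, g_hat)
    return sum(vi * bi.conjugate() for vi, bi in zip(v, b)).real


def rake_step(v, signals, g_hat, step=0.1):
    """Decide which signal was sent, then nudge g_hat toward the new measurement.

    step is an assumed smoothing constant; small values track a slowly varying channel.
    """
    decision = max(range(len(signals)), key=lambda i: metric(v, signals[i], g_hat))
    measurement = correlate(v, signals[decision], len(g_hat))
    new_g_hat = [(1 - step) * old + step * new
                 for old, new in zip(g_hat, measurement)]
    return decision, new_g_hat
```

If the decision is wrong, the measurement is taken with the wrong matched filter, so the update pulls the estimate toward a value unrelated to the true taps.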


A rough idea of the variance in the estimate of each tap $g_k$ can be made by continuing to
assume that decisions are made correctly. Assuming as before that the terms in the input PN
sequences have magnitude a, it can be seen from (9.75) that for each signaling interval of n
samples, the variance of the measurement noise (in each of the real and imaginary directions) is
$WN_0/(4a^2 n)$. There are roughly $\mathcal{T}_{coh} W/n$ signaling intervals in a coherence-time interval, and
we can approximate the estimate of $g_k$ as the average of those measurements. This reduces the
measurement noise by a factor of $\mathcal{T}_{coh} W/n$, reducing the variance of the measurement error²⁶ to


26The fact that the variance of the measurement error does not depend on W might be surprising. The estimation
error per discrete epoch 1/W is $WN_0/(4a^2 \mathcal{T}_{coh})$, which increases with W, but the number of measurements per
second increases in the same way, leading to no overall variation with W. Since the number of taps is increasing 
with W, however, the effect of estimation errors increases with W. However, this assumes a model in which there 
are many paths with propagation delays within 1/W of each other, and this is probably a poor assumption when 
$N_0/(4a^2 \mathcal{T}_{coh})$.


An obvious question, however, is the effect of decision errors. Each decision error generates an
“estimate” of each $g_k$ that is independent of the true $g_k$. Clearly, too many decision errors will
degrade the estimated value of each $g_k$, which in turn will cause further decision errors
until both estimations and decisions are worthless. Thus a rake receiver requires an initial good
estimate of each $g_k$, and also requires some mechanism for recovering from the above catastrophe.


Rake receivers are often used with larger alphabets of input PN sequences, and the analysis of
such non-binary systems is the same as for the binary case above. For example, the IS95 cellular
standard to be discussed later uses spread spectrum techniques with a bandwidth of 1.25 MHz. In
this system, a signal set of 64 orthogonal signal waveforms is used with a 64-ary rake receiver.
In that example, however, the rake receiver uses non-coherent techniques.


Usually, in a rake system, the PN sequences are chosen to be mutually orthogonal, but this is not 
really necessary. So long as each signal is a PN sequence with the appropriate autocorrelation 
properties, the channel estimation will work as before. The decision element for the data, of 
course, must be designed for the particular signal structure. For example, we could even use 
binary antipodal signaling, given some procedure to detect if the channel estimates become 
inverted. 


9.8 Diversity 


Diversity has been mentioned several times in the previous sections as a way to reduce error 
probabilities at the receiver. Diversity refers to a rather broad set of techniques, and the model 
of the last two sections must be generalized somewhat. 


The first part of this generalization is to represent the baseband modulated waveform as an
orthonormal expansion $u(t) = \sum_k u_k \phi_k(t)$ rather than the sinc expansion of the last two sections.
For the QAM type systems in the last two sections, this is a somewhat trivial change. The
modulation pulse sinc(Wt), which has energy 1/W, is normalized to $W^{1/2}\,\mathrm{sinc}(Wt)$. With this normalization, the noise
sequence $Z_1, Z_2, \ldots$ becomes $Z_k \sim \mathcal{CN}(0, N_0)$ for $k \in \mathbb{Z}^+$ and the antipodal input signal $\pm a$
satisfies $a^2 = E_b$.


Before discussing other changes in the model, we give a very simple example of diversity using 
the tapped gain model of Section 9.5. 


Example 9.8.1. Consider a Rayleigh fading channel modeled as a two-tap discrete-time baseband
model. The input is a discrete time sequence $U_m$, and the output is a discrete time complex
sequence described, as illustrated below, by

$$V_m = G_{0,m} U_m + G_{1,m} U_{m-1} + Z_m.$$

For each m, $G_{0,m}$ and $G_{1,m}$ are iid and circularly symmetric complex Gaussian rv’s with $G_{0,m} \sim \mathcal{CN}(0, 1/2)$.
This satisfies the condition $\sum_k E[|G_k|^2] = 1$ given in (9.48). The correlation of $G_{0,m}$
and $G_{1,m}$ with m is immaterial, and can be assumed uncorrelated. Assume that the sequence
$Z_m$ is a sequence of iid circularly symmetric rv’s, $Z_m \sim \mathcal{CN}(0, N_0)$.


Assume that a single binary digit is sent over this channel, sending either $u^0 = (\sqrt{E_b}, 0, 0, 0)$ or
$u^1 = (0, 0, \sqrt{E_b}, 0)$, each with equal probability. The input for the first hypothesis is at epoch


W is large. 
[Block diagram: the input $U_m$ feeds a two-tap delay line with gains $G_{0,m}$ and $G_{1,m}$; the tap outputs are summed and noise $Z_m$ is added to give $V_m$.]


Figure 9.13: Two-tap discrete-time Rayleigh fading model 


0 and for the second hypothesis at epoch 2, thus allowing a separation between the responses
from the two hypotheses. 


Conditional on $U = u^0$, it can be seen that $V_0 \sim \mathcal{CN}(0, E_b/2 + N_0)$, where the signal contribution
to $V_0$ comes through the first tap. Similarly, $V_1 \sim \mathcal{CN}(0, E_b/2 + N_0)$, with the signal contribution
coming through the second tap. Given $U = u^0$, $V_2 \sim \mathcal{CN}(0, N_0)$ and $V_3 \sim \mathcal{CN}(0, N_0)$. Since the
noise variables and the two gains are independent, it can be seen that $V_0, \ldots, V_3$ are independent
conditional on $U = u^0$. The reverse situation occurs for $U = u^1$, with $V_m \sim \mathcal{CN}(0, E_b/2 + N_0)$
for m = 2, 3 and $V_m \sim \mathcal{CN}(0, N_0)$ for m = 0, 1.

Since the phases $\angle V_m$ for $0 \le m \le 3$ are independent of the hypothesis, it can be seen that the energy in the set
of received components, $X_m = |V_m|^2$, $0 \le m \le 3$, forms a sufficient statistic. Under hypothesis
$u^0$, $X_0$ and $X_1$ are exponential rv’s with mean $E_b/2 + N_0$ and $X_2$ and $X_3$ are exponential with
mean $N_0$; all are independent. Thus the probability densities of $X_0$ and $X_1$ (given $u^0$) are given
by $\alpha e^{-\alpha x}$ for $x \ge 0$ where $\alpha = \frac{1}{N_0 + E_b/2}$. Similarly, the probability densities of $X_2$ and $X_3$ are
given by $\beta e^{-\beta x}$ for $x \ge 0$ where $\beta = \frac{1}{N_0}$. The reverse occurs under hypothesis $u^1$.
The LLR and the probability of error (under ML detection) are then evaluated in Exercise 9.13
to be

$$\mathrm{LLR}(x) = (\beta - \alpha)(x_0 + x_1 - x_2 - x_3),$$

$$\Pr(e) = \frac{3\alpha^2\beta + \alpha^3}{(\alpha + \beta)^3} = \frac{4 + \frac{3E_b}{2N_0}}{\left(2 + \frac{E_b}{2N_0}\right)^3}.$$


Note that as $E_b/N_0$ becomes large, the error probability approaches 0 as $(E_b/N_0)^{-2}$ instead of
$(E_b/N_0)^{-1}$, as with flat Rayleigh fading. This is a good example of diversity; errors are caused
by high fading levels, but with two independent taps, there is a much higher probability that
one or the other has reasonable strength.
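The closed-form error probability for the two-tap example can be checked against a small simulation of the energy detector. This is a sketch under stated assumptions: $N_0$ is normalized to 1, the trial count and seed are arbitrary, and the function names are hypothetical.

```python
import random


def pe_from_rates(alpha, beta):
    """Pr(e) = (3 alpha^2 beta + alpha^3) / (alpha + beta)^3, written via p = alpha/(alpha+beta)."""
    p = alpha / (alpha + beta)
    return 3 * p ** 2 * (1 - p) + p ** 3


def pe_two_tap(eb_over_n0):
    """Same probability expressed directly in terms of E_b / N_0."""
    x = eb_over_n0
    return (4 + 1.5 * x) / (2 + 0.5 * x) ** 3


def monte_carlo_pe(eb_over_n0, trials=20000, seed=0):
    """Simulate the energy comparison given u^0, with N_0 normalized to 1 (assumption)."""
    rng = random.Random(seed)
    alpha = 1.0 / (1.0 + eb_over_n0 / 2)   # rate of X_0, X_1 (mean E_b/2 + N_0)
    beta = 1.0                             # rate of X_2, X_3 (mean N_0)
    errors = 0
    for _ in range(trials):
        signal_energy = rng.expovariate(alpha) + rng.expovariate(alpha)
        noise_energy = rng.expovariate(beta) + rng.expovariate(beta)
        if signal_energy < noise_energy:   # detector would choose u^1
            errors += 1
    return errors / trials


if __name__ == "__main__":
    for x in (1.0, 10.0, 100.0):
        print(x, pe_two_tap(x), monte_carlo_pe(x))
```

For large $E_b/N_0$ the closed form behaves as $12\,(E_b/N_0)^{-2}$, which is the inverse-square decay noted above.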


Note that multiple physical transmission paths give rise both to multipath fading and to diver- 
sity; the first usually causes difficulties and the second usually ameliorates those difficulties. It 
is important to understand what the difference is between them. 


If the input bandwidth is chosen to be half as large as in the example above, then the two-tap 
model would essentially become a one-tap model; this would lead to flat Rayleigh fading and no 
diversity. The major difference is that with the two tap model, the path outputs are separated 
into two groups and the effect of each can be observed separately. With the one tap model, the
paths are all combined, since there are no longer independently observable sets of paths. 


It is also interesting to compare the diversity receiver above with a receiver that could make use 
of channel measurements. If the tap values were known, then an ML detector would involve a 
matched filter on the channel taps, as in Figure 9.12. In terms of the particular input in the above 
exercise, this would weight the outputs from the two channel taps according to the magnitude of 
the tap, whereas the diversity receiver above weights them equally. In other words, the diversity 
detector above doesn’t do quite the right thing given known tap values, but it certainly is a 
large improvement over narrow band transmission. 


The type of diversity used above is called time diversity since it makes use of the delay between 
different sets of paths. The analysis above hides a major part of the benefit to be gained by time 
diversity. For example, in the familiar reflecting wall example, there are only two paths. If the 
signal bandwidth is large enough that the response comes on different taps (or if the receiver 
measures the time delay on each path), then the fading will be eliminated. 


It appears that many wireless situations, particularly those in cellular and local area networks, 
contain a relatively small number of significant coherent paths, and if the bandwidth is large 
enough to resolve these paths, then the gain is far greater than that indicated in the example 
above. 


The diversity receiver above can be generalized to other discrete models for wireless channels. 
For example, the frequency band could be separated into segments separated by the coherence 
frequency, thus getting roughly independent fading in each and the ability to separate the outputs 
in each of those bands. Diversity in frequency is somewhat different than diversity in time, since 
it doesn’t allow the resolution of paths of different delays. 


Another way to achieve diversity is through multiple antennas at the transmitter and receiver. 
Note that multiple antennas at the receiver allow the full received power available at one antenna 
to be received at each antenna, rather than splitting the power as occurs with time diversity 
or frequency diversity. For all of these more general ways to achieve diversity, the input and 
output should obviously be represented by the appropriate orthonormal expansions to bring out 
the diversity terms. 


The two-tap example above can be easily extended to an arbitrary number of taps. Assume
the model of Figure 9.13 modified to have L taps, $G_{0,m}, \ldots, G_{L-1,m}$, satisfying $G_{k,m} \sim \mathcal{CN}(0, 1/L)$
for $0 \le k \le L-1$. The input is assumed to be either $u^0 = (\sqrt{E_b}, 0, \ldots, 0)$ or
$u^1 = (0, \ldots, 0, \sqrt{E_b}, 0, \ldots, 0)$, where each of these 2L-tuples has zeros in all but one position,
namely position 0 for $u^0$ and position L for $u^1$. The energy in the set of received components,
$X_m = |V_m|^2$, $0 \le m \le 2L-1$, forms a sufficient statistic for the same reason as in the dual diversity
case. Under hypothesis $u^0$, $X_0, \ldots, X_{L-1}$ are exponential rv’s with density $\alpha \exp(-\alpha x)$
where $\alpha = \frac{L}{N_0 L + E_b}$. Similarly, $X_L, \ldots, X_{2L-1}$ are exponential rv’s with density $\beta \exp(-\beta x)$.
All are conditionally independent given $u^0$. The reverse is true given hypothesis $u^1$.

It can be seen that the ML detection rule is to choose $u^0$ if $\sum_{m=0}^{L-1} X_m \ge \sum_{m=L}^{2L-1} X_m$ and to
choose $u^1$ otherwise. Exercise 9.14 then shows that the error probability is

$$\Pr(e) = \sum_{\ell=L}^{2L-1} \binom{2L-1}{\ell} p^{\ell} (1-p)^{2L-1-\ell},$$
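where $p = \alpha/(\alpha + \beta)$. Substituting in the values for $\alpha$ and $\beta$, this becomes

$$\Pr(e) = \sum_{\ell=L}^{2L-1} \binom{2L-1}{\ell} \frac{\left(1 + \frac{E_b}{LN_0}\right)^{2L-1-\ell}}{\left(2 + \frac{E_b}{LN_0}\right)^{2L-1}}. \qquad (9.78)$$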


It can be seen that the dominant term in this sum is $\ell = L$. For any given L, then, the probability
of error decreases with $E_b$ as $E_b^{-L}$. At the same time, however, if L is increased for a given $E_b$,
then eventually the probability of error starts to increase and approaches 1/2 asymptotically.
In other words, increased diversity can decrease error probability up to a certain point but then
further increased diversity, for fixed $E_b$, is counterproductive.


If one evaluates (9.78) as a function of $E_b/N_0$ and L, one finds that Pr(e) is minimized for large
but fixed $E_b/N_0$ when L is on the order of $0.3\,E_b/N_0$. The minimum is quite broad, but too much
diversity does not help. The situation remains essentially the same with channel measurement.
Here the problem is that when the available energy is spread over too many degrees of freedom,
there is not enough energy per degree of freedom to measure the channel.
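Equation (9.78) is easy to evaluate numerically, which gives one way to explore the broad minimum in L described above. The sketch below writes the sum in terms of $p = 1/(2 + E_b/(LN_0))$ for numerical convenience; the function names and the search limit `max_taps` are assumptions.

```python
from math import comb


def pe_diversity(num_taps, eb_over_n0):
    """Error probability (9.78) for L-fold diversity with Rayleigh taps."""
    L = num_taps
    y = eb_over_n0 / L              # E_b / (L N_0)
    p = 1.0 / (2.0 + y)             # p = alpha / (alpha + beta)
    q = 1.0 - p
    return sum(comb(2 * L - 1, l) * p ** l * q ** (2 * L - 1 - l)
               for l in range(L, 2 * L))


def best_diversity(eb_over_n0, max_taps=200):
    """Value of L in 1..max_taps that minimizes (9.78) for a fixed E_b/N_0."""
    return min(range(1, max_taps + 1), key=lambda L: pe_diversity(L, eb_over_n0))


if __name__ == "__main__":
    x = 100.0                       # E_b / N_0 as a ratio, not in dB
    for L in (1, 2, 4, 8, 16, 32, 64, 128):
        print(L, pe_diversity(L, x))
    print("minimizing L:", best_diversity(x))
```

With L = 1 the expression reduces to $1/(2 + E_b/N_0)$, and with L = 2 it reproduces the two-tap result of Example 9.8.1.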


The preceding discussion assumed that each diversity path is Rayleigh, but we have seen that 
with time diversity, the individual paths might become separable, thus allowing much lower error 
probability than if the taps remain Rayleigh. Perhaps at this point, we are trying to model the 
channel too accurately. If a given transmitter and receiver design is to be used over a broad 
set of different channel behaviors, then the important question is the fraction of behaviors over 
which the design works acceptably. This question ultimately must be answered experimentally, 
but simple models such as Rayleigh fading with diversity provide some insight into what to 
expect. 


9.9 CDMA; The IS95 Standard 


In this section, IS95, one of the major classes of cellular standards, is briefly described. This 
system has been selected both because it is conceptually more interesting, and because most 
newer systems are focusing on this approach. This standard uses spread spectrum, which is often 
known by the name CDMA (Code Division Multiple Access). There is no convincing proof that 
spread spectrum is inherently superior to other approaches, but it does have a number of inherent 
engineering advantages over traditional narrow band systems. Our main purpose, however, is 
to get some insight into how a major commercial cellular network system deals with some of 
the issues we have been discussing. The discussion here focuses on the issues arising with voice 
transmission. 


IS95 uses a frequency band from 800 to 900 megahertz (MHz). The lower half of this band is used
for transmission from cell phones to base station (the uplinks), and the upper half is used for base
station to cell phones (the downlinks). There are multiple subbands²⁷ within this band, each
1.25 MHz wide. Each base station uses each of these subbands, and multiple cell phones within
a cell can share the same subband. Each downlink subband is 45 MHz above the corresponding
uplink subband. The transmitted waveforms are sufficiently well filtered at both the cell phones


27It is common in the cellular literature to use the word channel for a particular frequency subband; we will 
continue to use the word channel for the transmission medium connecting a particular transmitter and receiver. 
Later we use the words multiaccess channel to refer to the uplinks for multiple cell phones in the same cell. 
and the base stations so that they don’t interfere appreciably with reception on the opposite
channel. 


The other two major established cellular standards use TDMA (time-division multiple access). 
The subbands are more narrow in TDMA, but only one cell phone uses a subband at a time to 
communicate with a given base station. In TDMA, there is little interference between different 
cell phones in the same cell, but considerable interference between cells. CDMA has more 
interference between cell phones in the same cell, but less between cells. 


A high level block diagram for the parts of a transmitter is given in Figure 9.14.

[Block diagram: Voice Waveform -> Voice Compressor -> Channel Coder -> Modulator -> Channel]


Figure 9.14: High Level Block Diagram of Transmitters 


The receiver, at a block level viewpoint (see Figure 9.15), performs the corresponding receiver 
functions in reverse order. This can be viewed as a layered system, although the choice of 
function in each block is somewhat related to that in the other blocks. 


[Block diagram: Voice Waveform <- Voice Decoder <- Channel Decoder <- Demodulator <- Channel]


Figure 9.15: High Level Block Diagram of Receiver 


These three blocks are described in the following subsections. The voice compression and channel 
coding are quite similar in each of the standards, but the modulation is very different. 


9.9.1 Voice compression 


The voice waveform, in all of these standards, is first segmented into 20 ms increments. These
segments are long enough to allow considerable compression, but short enough to cause relatively
little delay. In IS95, each 20 ms segment is encoded into 172 bits. The digitized voice rate is
then 8600 = 172/0.02 bits per second (bps). Voice compression has been an active research area
for many years. In the early days, voice waveforms, which lie in a band from about 400 to 3200
Hz, were simply sampled at 8000 times a second, corresponding to a 4 kHz band. Each sample
was then quantized to 8 bits for a total of 64,000 bps. Achieving high quality voice at 8600 bps
is still a moderate challenge today and requires considerable computation.


The 172 bits per 20 ms segment from the compressor is then extended by 12 bits per segment 
for error detection. This error detection is unrelated to the error correction algorithms to be 
discussed later, and is simply used to detect when those systems fail to correct the channel 
errors. Each of these 12 bits is a parity check (i.e., a modulo-2 sum) of a prespecified set of the 
data bits. Thus, it is very likely, when the channel decoder fails to decode correctly, that one 
of these parity checks will fail to be satisfied. When such a failure occurs, the corresponding 
frame is mapped into 20 ms of silence, thus avoiding loud squawking noises under bad channel
conditions. 


Each segment of 172 + 12 bits is then extended by 8 bits, all set to 0. These bits are used as 
a terminator sequence for the convolutional code to be described shortly. With the addition of 
these bits, each 20 ms segment generates 192 bits, so this overhead converts the rate from
8600 to 9600 bps. The timing everywhere else in the transmitter and receiver is in multiples of 
this bit rate. In all the standards, many overhead items creep in, each performing small but 
necessary functions, but each increasing the overall transmitted bit rate. 
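The bit-rate bookkeeping in this subsection can be reproduced with a few lines of arithmetic. The constant and function names below are hypothetical; the numbers are those given in the text.

```python
FRAMES_PER_SECOND = 50          # one frame every 20 ms
VOICE_BITS = 172                # compressed voice bits per frame
PARITY_BITS = 12                # error-detection bits per frame
TAIL_BITS = 8                   # zero bits terminating the convolutional code


def rate_bps(bits_per_frame):
    """Bit rate in bits per second for a given number of bits per 20 ms frame."""
    return bits_per_frame * FRAMES_PER_SECOND


def pcm_rate_bps(samples_per_second=8000, bits_per_sample=8):
    """Rate of the early uncompressed sampling scheme mentioned in the text."""
    return samples_per_second * bits_per_sample


if __name__ == "__main__":
    print(rate_bps(VOICE_BITS))                             # compressed voice rate
    print(rate_bps(VOICE_BITS + PARITY_BITS + TAIL_BITS))   # rate into the channel coder
    print(pcm_rate_bps())                                   # uncompressed rate
```

The 20 overhead bits per frame account for the full difference between the 8600 bps voice rate and the 9600 bps rate seen by the channel coder.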


9.9.2. Channel coding and decoding 


The channel encoding and decoding use a convolutional code and a Viterbi decoder. The convolutional
code has rate 1/3, thus producing three output bits per input bit, and mapping the 9600
bps input into a 28.8 kbps output. The choice of rate is not very critical, since it involves how
much coding is done here and how much is done later as part of the modulation proper. The
convolutional encoder has a constraint length of 8, so each of the three outputs corresponding
to a given input depends on the current input plus the eight previous inputs. There are then
$2^8 = 256$ possible states for the encoder, corresponding to the possible sets of values for the
previous 8 inputs.


The complexity of the Viterbi algorithm is directly proportional to the number of states, so there 
is a relatively sharp tradeoff between complexity and error probability. The fact that decoding 
errors are caused primarily by more fading than expected (either a very deep fade that cannot 
be compensated by power control or by an inaccurate channel measurement), suggests that 
increasing the constraint length from 8 to 9 would, on the one hand be somewhat ineffective, 
and, on the other hand, double the decoder complexity. 


The convolutional code is terminated at the end of each voice segment, thus turning the
convolutional encoder into a block code of block length 576 and rate 1/3, with 192 input bits per 
segment. As mentioned in the previous subsection, these 192 bits include 8 bits to terminate 
the code and return it to state 0. Part of the reason for this termination is the requirement 
for small delay, and part is the desire to prevent a fade in one segment from causing errors in 
multiple voice segments (the failure to decode correctly in one segment makes decoding in the 
next segment less reliable in the absence of this termination). 
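
As a rough illustration of the termination described above, the following sketch encodes one segment with a rate 1/3 encoder that remembers the eight previous inputs. The generator taps are chosen here purely for illustration (the text does not list the generators), and the function name is hypothetical.

```python
import random

# ASSUMPTION: illustrative tap masks over (current input, 8 previous inputs);
# the text does not specify the generators used in the standard.
GENERATORS = (0b101101111, 0b110110011, 0b111001001)
MEMORY = 8  # number of previous inputs the encoder remembers


def encode_terminated(data_bits):
    """Encode data bits followed by MEMORY zeros; return (output, final_state)."""
    state = 0  # the 8 previous inputs, most recent in the top bit
    out = []
    for bit in list(data_bits) + [0] * MEMORY:
        window = (bit << MEMORY) | state          # current input plus 8 previous
        for g in GENERATORS:
            out.append(bin(window & g).count("1") % 2)  # mod-2 sum of tapped bits
        state = window >> 1                       # shift: drop the oldest input
    return out, state


random.seed(0)
data = [random.randint(0, 1) for _ in range(184)]  # 172 voice + 12 error-detection bits
coded, final_state = encode_terminated(data)
print(len(coded), final_state)  # 576 0
```

The eight appended zeros flush the shift register, so the encoder starts every segment from state 0 regardless of what was sent in the previous segment.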


When a Viterbi decoder makes an error, it is usually detectable from the likelihood ratios in 
the decoder, so the 12 bit overhead for error detection could probably have been avoided. Many 
such tradeoffs between complexity, performance, and overhead must be made in both standards 
and products. 


The decoding uses soft decisions from the output of the demodulator. The ability to use
likelihood information (i.e., soft decisions) from the demodulator is one reason for the use of
convolutional codes and Viterbi decoding. Viterbi decoding uses this information in a natural way, 
whereas, for some other coding and decoding techniques, this can be unnatural and difficult. All 
of the major standards use convolutional codes, terminated at the end of each voice segment, 
and decode with the Viterbi algorithm. It is worth noting that channel measurements are useful 
in generating good likelihood inputs to the Viterbi decoder. 


The final step in the encoding process is to interleave the 576 output bits from the encoder 
corresponding to a given voice segment. Correspondingly, the first step in the decoding process 
is to de-interleave the bits (actually the soft decisions) coming out of the demodulator. It can 
be seen without analysis that if the noise coming into a Viterbi decoder is highly correlated, 
then the Viterbi decoder, with its short constraint length, is more likely to make a decoding 
error than if the noise is independent. The next subsection will show that the noise from the 
demodulator is in fact highly correlated, and thus the interleaving breaks up this correlation. 
Figure 9.16 summarizes this channel encoding process. 


[Figure: 8.6 kbps, 172 b/seg. -> 12 bit error detection -> 8 bit conv. terminator -> 9.6 kbps, 192 b/seg. 
-> convolutional encoder -> 28.8 kbps, 576 b/seg. -> interleave -> 28.8 kbps, 576 b/seg.] 


Figure 9.16: Block diagram of Channel Encoding 
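
The effect of interleaving on bursts can be seen in a small sketch. The text does not say how the 576 bits are permuted, so the row/column block interleaver and its 32 by 18 shape below are assumptions made only for illustration.

```python
ROWS, COLS = 32, 18   # ASSUMPTION: illustrative array shape with ROWS * COLS = 576


def interleave(bits):
    """Write row by row, read column by column."""
    assert len(bits) == ROWS * COLS
    return [bits[r * COLS + c] for c in range(COLS) for r in range(ROWS)]


def deinterleave(bits):
    """Inverse permutation: write column by column, read row by row."""
    assert len(bits) == ROWS * COLS
    return [bits[c * ROWS + r] for r in range(ROWS) for c in range(COLS)]


segment = list(range(576))             # use indices so positions are visible
sent = interleave(segment)
assert deinterleave(sent) == segment   # the two maps are inverses

# A burst hitting 6 consecutive transmitted positions lands on encoder-output
# positions that are 18 apart after de-interleaving.
burst = sorted(sent[100:106])
print(burst)   # [75, 93, 111, 129, 147, 165]
```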


9.9.3 Viterbi decoding for fading channels 


In order to get some sense of why the above convolutional code with Viterbi decoding will not 
work very well if the coding is followed by straight-forward binary modulation, suppose the pulse 
position modulation of Subsection 9.6.1 is used and the channel is represented by a single tap 
with Rayleigh fading. The resulting bandwidth is well within typical values of $\mathcal{F}_{coh}$, so the single 
tap model is reasonable. The coherence time is typically at least a msec, but in the absence of 
moving vehicles, it could easily be more than 20 msec. 


This means that an entire 20 msec segment of voice could easily be transmitted during a 
deep fade, and the Viterbi decoder, even with interleaving within that 20 msec, would 
not be able to decode successfully. If the fading is much faster, the Viterbi decoder, with 
likelihood information on the incoming bits, would probably work fairly successfully, but that is 
not something that can be relied upon. 


There are only three remedies for this situation. One is to send more power when the channel is 
faded. As shown in Exercise 9.11, however, if the input power compensates completely for the 
fading (i.e., the input power at time $m$ is $1/|g_m|^2$), then the expected input power is infinite. 
This means that, with finite average power, deep fades for prolonged periods cause outages. 
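
The divergence can be seen in one line. As a sketch, assume the tap is normalized so that $|G_m|^2$ is exponential with mean 1 (a normalization chosen here for illustration; Rayleigh fading makes $|G_m|^2$ exponential). Then

```latex
\mathsf{E}\left[\frac{1}{|G_m|^2}\right]
  = \int_0^\infty \frac{1}{x}\, e^{-x}\, dx
  \;\ge\; e^{-1} \int_0^1 \frac{1}{x}\, dx
  = \infty .
```

The integrand behaves like $1/x$ near $x = 0$, so the very deep fades, although rare, dominate the average.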


The second remedy is diversity, in which each codeword is spread over enough coherence band- 
widths or coherence-time intervals to achieve averaging over the channel fades. Using diversity 
over several coherence-time intervals causes delays proportional to the coherence time, which is 
usually unacceptable for voice. Diversity can be employed by using a bandwidth larger than 
the coherence frequency (this can be done using multiple taps in the tapped delay line model or 
multiple frequency bands). 


The third remedy is the use of variable rate transmission. This is not traditional for voice, since 
the voice encoding traditionally produces a constant rate stream of input bits into the channel, 
and the delay constraint is too stringent to queue this input and transmit it when the channel 
is good. It would be possible to violate the source/channel separation principle and have the 
source produce “important bits” at one rate and “unimportant bits” at another rate. Then 
when the channel is poor, only the important bits would be transmitted. Some cellular systems, 
particularly newer ones, have features resembling this. 


For data, however, variable rate transmission is very much a possibility since there is usually not 
a stringent delay requirement. Thus, data can be transmitted at high rate when the channel is 
good and at low rate or zero rate when the channel is poor. Newer systems also take advantage 
of this possibility. 


9.9.4 Modulation and demodulation 


The final part of the high level block diagram of the IS95 transmitter is to modulate the output 
of the interleaver before channel transmission. This is where spread spectrum comes in, since 
this 28.8 Kbps data stream is now spread into a 1.25 MHz bandwidth. The bandwidth of the 
corresponding received spread waveform will often be broader than the coherence frequency, 
thus providing diversity protection against deep fades. A rake receiver will take advantage of 
this diversity. Before elaborating further on these diversity advantages, the mechanics of the 
spreading is described. 


The first step of the modulation is to segment the interleaver output into strings of length 6, 
and then map each successive 6-bit string into a 64-bit binary string. The mapping maps each 
of the 64 strings of length 6 into the corresponding row of the $H_6$ Hadamard matrix described 
in Section 8.6.1. Each row of this Hadamard matrix differs from each other row in 32 places 
and each row, except the all zero row, contains exactly 32 ones and 32 zeros. It is thus a binary 
orthogonal code. 


Suppose the selected word from this code is mapped into a PAM sequence by the 2-PAM map 
$\{0,1\} \to \{+a,-a\}$. These 64 sequences of binary antipodal values are called Walsh functions. 
The symbol rate coming out of this 6 bit to 64 bit mapping is $(64/6) \cdot 28{,}800 = 307{,}200$ symbols 
per second. 
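
The properties claimed for the Hadamard rows are easy to check numerically. The sketch below builds the matrix by the usual doubling recursion and applies the 2-PAM map; the way a 6-bit string is turned into a row index is an assumption made here, since the indexing convention of Section 8.6.1 is not repeated in this section.

```python
def hadamard_rows(b):
    """Rows of the 2^b x 2^b binary Hadamard matrix via the doubling recursion."""
    rows = [[0]]
    for _ in range(b):
        rows = [r + r for r in rows] + [r + [1 - x for x in r] for r in rows]
    return rows


H = hadamard_rows(6)   # 64 rows of 64 binary digits


def walsh(six_bits, a=1.0):
    """ASSUMPTION: read the 6 bits as a binary row index, then map 0 -> +a, 1 -> -a."""
    index = int("".join(str(bit) for bit in six_bits), 2)
    return [a if chip == 0 else -a for chip in H[index]]


# Every pair of distinct rows differs in exactly 32 places.
distances = {sum(x != y for x, y in zip(H[i], H[j]))
             for i in range(64) for j in range(i + 1, 64)}
print(distances)                              # {32}

# After the 2-PAM map, distinct Walsh functions have zero inner product.
w1, w2 = walsh([0, 0, 0, 1, 0, 1]), walsh([1, 1, 0, 0, 1, 0])
print(sum(x * y for x, y in zip(w1, w2)))     # 0.0

print(28_800 * 64 // 6)                       # 307200 symbols per second
```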


To get some idea of why these Walsh functions are used, let $x_1^k, \ldots, x_{64}^k$ be the $k$th Walsh 
function, amplified by a factor $a$, and consider this as a discrete-time baseband input. For 
simplicity, assume flat fading with a single channel tap of amplitude $g$. Suppose that baseband 
WGN of variance $N_0/2$ (per real and imaginary part) is added to this sequence, and consider 
detecting which of the 64 Walsh functions was transmitted. Let $E_s$ be the expected received 
energy for each of the Walsh functions. The non-coherent detection result from (9.59) shows 
that the probability that hypothesis $j$ is more likely than $k$, given that $x^k(t)$ is transmitted, is 
$\frac{1}{2}\exp[-E_s/(2N_0)]$. Using the union bound over the 63 possible incorrect hypotheses, the probability 
of error, using non-coherent detection and assuming a single tap channel filter, is 


$$\Pr(e) \le \frac{63}{2}\exp\left[\frac{-E_s}{2N_0}\right]. \qquad (9.79)$$


The probability of error is not the main subject of interest here, since the detector output is 
soft decisions that are then used by the Viterbi decoder. However, the error probability lets us 
understand the rationale for using such a large signal set with orthogonal signals. 

If coherent detection were used, the analogous union bound on error probability would be 
$63\,Q(\sqrt{E_s/N_0})$. As discussed in Section 9.6.2, this goes down exponentially with $E_s$ in the 
same way as (9.79), but the coefficient is considerably smaller. However, the number of additional
dB required using non-coherent detection to achieve the same $\Pr(e)$ as coherent detection 
decreases almost inversely with the exponent in (9.79). This means that by using a large number 
of orthogonal functions (64 in this case), we make the exponent in (9.79) large in magnitude, 
and thus approach (in dB terms) what could be achieved by coherent detection. 


The argument above is incomplete, because $E_s$ is the transmitted energy per Walsh function. 
However, 6 binary digits are used to select each transmitted Walsh function. Thus, $E_b$ in this 
case is $E_s/6$ and (9.79) becomes 


$$\Pr(e) \le \frac{63}{2}\exp(-3E_b/N_0). \qquad (9.80)$$
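
To get a feel for the numbers, the bound can be evaluated directly. The sketch below multiplies the pairwise error probability $\frac{1}{2}\exp[-E_s/(2N_0)]$ by the number of incorrect hypotheses and substitutes $E_s = 6E_b$; the function name and the sample values of $E_b/N_0$ are chosen here for illustration.

```python
import math


def union_bound_noncoherent(eb_over_n0_db, m=64):
    """(m - 1) * (1/2) * exp(-Es / (2 N0)) with Es = log2(m) * Eb."""
    eb_n0 = 10 ** (eb_over_n0_db / 10)
    es_n0 = math.log2(m) * eb_n0
    return (m - 1) * 0.5 * math.exp(-es_n0 / 2)


for db in (2, 4, 6, 8):
    # Compare 64 orthogonal signals with binary orthogonal signals (m = 2).
    print(db, union_bound_noncoherent(db), union_bound_noncoherent(db, m=2))
```

For small $E_b/N_0$ the bound exceeds 1 and is uninformative, but it falls off much faster with $E_b/N_0$ than the binary case because the exponent is six times larger.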


This large signal set also avoids the 3 dB penalty for orthogonal signaling rather than antipodal 
signaling that we have seen for binary signal sets. Here the cost of orthogonality essentially 
lies in using an orthogonal code rather than the corresponding biorthogonal code with 7 bits of 
input and 128 codewords$^{28}$, i.e., a factor of 6/7 in rate. 


A questionable issue here is that two codes (the convolutional code as an outer code, followed 
by the Walsh function code as an inner code) are used in place of a single code. There seems to 
be no clean analytical way of showing that this choice makes good sense over all choices of single 
or combined codes. On the other hand, each code is performing a rather different function. 
The Viterbi decoder is eliminating the errors caused by occasional fades or anomalies, and the 
Walsh functions allow non-coherent detection and also enable a considerable reduction in error 
probability because of the large orthogonal signal sets rather than binary transmission. 


The modulation scheme in IS95 next spreads the above Walsh functions into an even wider 
bandwidth transmitted signal. The stream of binary digits out of the Hadamard encoder$^{29}$ is 
combined with a pseudo-noise (PN) sequence at a rate of 1228.8 kbps, i.e., four PN bits for each 
signal bit. In essence, each bit of the 307.2 kbps stream out of the Walsh encoder is repeated 
four times (to achieve the 1228.8 kbps rate) and is then added mod-2 to the PN sequence. This 
further spreading provides diversity over the available 1.25 MHz bandwidth. 
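
The repeat-and-add operation is simple enough to sketch directly. Here random bits stand in for the PN sequence (an assumption for illustration only), and the noiseless `despread` helper is hypothetical; it only shows that a receiver that knows the PN sequence can undo the spreading.

```python
import random

REPEAT = 4  # PN bits per Walsh-encoder bit: 1228.8 kbps / 307.2 kbps


def spread(bits, pn):
    """Repeat each bit REPEAT times and add the PN sequence mod 2."""
    repeated = [b for b in bits for _ in range(REPEAT)]
    assert len(pn) == len(repeated)
    return [b ^ p for b, p in zip(repeated, pn)]


def despread(chips, pn):
    """Noiseless inverse: remove the known PN sequence, keep one copy per bit."""
    removed = [c ^ p for c, p in zip(chips, pn)]
    return removed[::REPEAT]


random.seed(1)
walsh_bits = [random.randint(0, 1) for _ in range(64)]   # one Hadamard row's worth
pn = [random.randint(0, 1) for _ in range(64 * REPEAT)]  # stand-in for the PN bits
chips = spread(walsh_bits, pn)
print(len(chips))                          # 256
print(despread(chips, pn) == walsh_bits)   # True
```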


The constraint length here is $n = 42$ binary digits, so the period of the cycle is $2^{42} - 1$ (about 
41 days). We can ignore the difference between simplex and orthogonal, and simply regard each 
cycle as orthogonal to each other cycle. Since the cycle is so long, however, it is better to simply 
approximate each cycle as a sequence of iid binary digits. There are several other PN sequences 
used in the IS-95 standard, and this one, because of its constraint length, is called the “long PN 
sequence.” PN sequences have many interesting properties, but for us it is enough to view them 
as iid but also known to the receiver. 
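
A toy shift register shows where the period $2^n - 1$ comes from, and a one-line calculation confirms the "about 41 days" figure at 1.2288 million PN bits per second. The length-4 recursion below is chosen here for illustration and has nothing to do with the actual long PN generator.

```python
def lfsr_sequence(taps, state, length):
    """Generate bits of the recursion s[t+n] = XOR of s[t+k] for k in taps."""
    s = list(state)
    n = len(state)
    while len(s) < length:
        t = len(s) - n
        bit = 0
        for k in taps:
            bit ^= s[t + k]
        s.append(bit)
    return s[:length]


# Toy register of length n = 4 with s[t+4] = s[t+1] XOR s[t].
seq = lfsr_sequence(taps=(0, 1), state=(0, 0, 0, 1), length=30)
period = next(p for p in range(1, 16) if seq[p:p + 15] == seq[:15])
print(period, sum(seq[:15]))   # 15 8  (period 2^4 - 1, with 8 ones and 7 zeros)

# Duration of one cycle of the long PN sequence.
days = (2**42 - 1) / 1.2288e6 / 86400
print(round(days, 1))          # 41.4
```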


The initial state of the long PN sequence is used to distinguish between different cell phones, and 
in fact this initial state is the only part of the transmitter system that is specific to a particular 
cell phone. 


The resulting binary stream, after adding the long PN sequence, is at a rate of 1.2288 Mbps. 
This stream is duplicated into two streams prior to being quadrature modulated onto a cosine 
and sine carrier. The cosine stream is added mod-2 to another PN-sequence (called the in-phase 
or I-PN) sequence at rate 1.2288 Mbps, and the sine stream is added mod-2 to another PN 
sequence called the quadrature or Q-PN sequence. The I-PN and Q-PN sequences are the same 
for all cell phones and help in demodulation. 


$^{28}$This biorthogonal code is called a (64,7,32) Reed-Muller code in the coding literature. 

$^{29}$We visualized mapping the Hadamard binary sequences by a 2-PAM map into Walsh functions for simplicity. 
For implementation, it is more convenient to maintain binary (0,1) sequences until the final steps in the modulation 
process are completed. 
The final part of modulation is for the two binary streams to go through a 2-PAM map into 
digital streams of $\pm a$. Each of these streams (over blocks of 256 bits) maintains the orthogonality 
of the 64 Walsh functions. Each of these streams is then passed through a baseband filter with 
a sharp cutoff at the Nyquist bandwidth of 614.4 kHz. This is then quadrature modulated onto 
the carrier with a bandwidth of 614.4 kHz above and below the carrier, for an overall bandwidth 
of 1.2288 MHz. Note that almost all the modulation operation here is digital, with only the 
final filter and modulation being analog. The question of what should be done digitally and 
what in analog form (other than the original binary interface) is primarily a question of ease of 
implementation. 


A block diagram of the modulator is shown in Figure 9.17. 


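[Figure: 28.8 kbps -> 6 bits to 64 bits -> 307.2 kbps -> mod-2 addition of the long PN sequence (1228.8 Kbps) 
-> two branches: mod-2 addition of I-PN -> 2PAM -> filter -> modulation onto cos; 
mod-2 addition of Q-PN -> 2PAM -> filter -> modulation onto sin] 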


Figure 9.17: Block diagram of Source and Channel Encoding 


Next consider the receiver. The fixed PN sequences that have been added to the Walsh functions 
do not alter the orthogonality of the signal set, which now consists of 64 functions, each of length 
256 and each (viewed at baseband) containing both a real and imaginary part. The received 
waveform, after demodulation to baseband and filtering, is passed through a Rake receiver 
similar to the one discussed earlier. The Rake receiver here has a signal set of 64 signals rather 
than 2. Also, the channel here is viewed not as taps at the sampling rate, but rather as 3 taps 
at locations dynamically moved to catch the major received paths. 


As mentioned before, the detection is non-coherent rather than coherent. 


The output of the rake receiver is a likelihood value for each of the 64 hypotheses. This is 
then converted into a likelihood value for each of the 6 bits in the inverse of the 6 bit to 64 bit 
Hadamard code map. 
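
A single-tap sketch (with no rake combining) may help make these two steps concrete. The receiver correlates against all 64 Walsh functions and uses squared magnitudes, so the unknown channel phase does not matter. The text does not say how the 64 values are converted to 6 per-bit values; the max-minus-max rule below is an assumption used only for illustration, as are the bit-to-row indexing and the noise level.

```python
import cmath
import random


def hadamard_rows(b):
    """Rows of the 2^b x 2^b binary Hadamard matrix via the doubling recursion."""
    rows = [[0]]
    for _ in range(b):
        rows = [r + r for r in rows] + [r + [1 - x for x in r] for r in rows]
    return rows


WALSH = [[1 - 2 * chip for chip in row] for row in hadamard_rows(6)]  # 0 -> +1, 1 -> -1


def noncoherent_metrics(received):
    """Squared magnitude of the correlation with each of the 64 Walsh functions."""
    return [abs(sum(v * w for v, w in zip(received, row))) ** 2 for row in WALSH]


def bit_soft_values(metrics):
    """ASSUMPTION: best metric with bit = 0 minus best metric with bit = 1."""
    soft = []
    for pos in range(6):                       # pos 0 is the most significant bit
        mask = 1 << (5 - pos)
        best0 = max(m for k, m in enumerate(metrics) if not k & mask)
        best1 = max(m for k, m in enumerate(metrics) if k & mask)
        soft.append(best0 - best1)
    return soft


random.seed(2)
k_sent = 0b101100
g = 0.8 * cmath.exp(1j * 2.1)   # channel tap whose phase the receiver does not use
received = [g * w + complex(random.gauss(0, 0.3), random.gauss(0, 0.3))
            for w in WALSH[k_sent]]
metrics = noncoherent_metrics(received)
k_hat = max(range(64), key=metrics.__getitem__)
soft = bit_soft_values(metrics)
print(k_hat == k_sent)                       # True
print([0 if s > 0 else 1 for s in soft])     # [1, 0, 1, 1, 0, 0]
```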


One of the reasons for using an interleaver between the convolutional code and the Walsh function 
encoder is now apparent. After the Walsh function detection, the errors in the string of 6 bits 
from the detection circuit are highly correlated. The Viterbi decoder does not work well 
with bursts of errors, so the interleaver spreads these errors out, allowing the Viterbi decoder to 
operate with noise that is relatively independent from bit to bit. 


9.9.5 Multiaccess Interference in IS95 


A number of cell phones will use the same 1.2288 MHz frequency band in communicating with the 
same base station, and other nearby cell phones will also use the same band in communicating 
with their base stations. We now want to understand what kind of interference these cell phones 
cause for each other. Consider the detection process for any given cell phone and the effect of 
the interference from the other cell phones. 


Since each cell phone uses a different phase of the long PN sequence, the PN sequences from the 
interfering cell phones can be modeled as random iid binary streams. Since each of these streams 
is modeled as iid, the mod-2 addition of the PN stream and the data is still an iid stream of 
binary digits. If the filter used before transmission is very sharp (which it is, since the 1.2288 
MHz bands are quite close together), the Nyquist pulses can be approximated by sinc pulses. It 
also makes sense to model the sample clock of each interfering cell phone as being uniformly 
distributed. This means that the interfering cell phones can be modeled as being wide sense 
stationary with a flat spectrum over the 1.2288 MHz band. 


The more interfering cell phones there are in the same frequency band, the more interference 
there is, but also, since these interfering signals are independent of each other, we can invoke 
the central limit theorem to see that this aggregate interference will be approximately Gaussian. 


To get some idea of the effect of the interference, assume that each interfering cell phone is 
received at the same baseband energy per information bit given by $E_b$. Since there are 9600 
information bits per second entering the encoder, the power in the interfering waveform is then 
$9600E_b$. This noise is evenly spread over 2,457,600 dimensions per second, so is $(4800/2.4576 \times 
10^6)E_b = E_b/512$ per dimension. Thus the noise per dimension is increased from $N_0/2$ to 
$(N_0/2 + kE_b/512)$ where $k$ is the number of interferers. With this change, (9.80) becomes 

$$\Pr(e) \le \frac{63}{2}\exp\left[\frac{-3E_b}{N_0 + kE_b/256}\right]. \qquad (9.81)$$

In reality, the interfering cell phones are received with different power levels, and because of this, 
the system uses a fairly elaborate system of power control to attempt to equalize the received 
powers of the cell phones being received at a given base station. Those cell phones being received 
at other base stations presumably have lower power at the given base station, and thus cause 
less interference. It can be seen that with a large set of interferers, the assumption that they 
form a Gaussian process is even better than with a single interferer. 


The factor of 256 in (9.81) is due to the spreading of the waveforms (sending them in a bandwidth 
of 1.2288 MHz rather than in a narrow band). This spreading, of course, is also the reason why 
appreciable numbers of other cell phones must use the same band. Since voice users are typically 
silent half the time while in a conversation, and the cell phone need send no energy during these 
silent periods, the number of tolerable interferers is doubled. 
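
The tradeoff between interferers and error probability can be explored numerically. The sketch below evaluates a bound of the form $\frac{63}{2}\exp[-3E_b/(N_0 + kE_b/256)]$, i.e., (9.80) with the noise per dimension raised from $N_0/2$ to $N_0/2 + kE_b/512$, with $N_0$ normalized to 1. The operating point and the target value are hypothetical numbers chosen only to exercise the formula.

```python
import math


def pr_error_bound(eb_over_n0, k):
    """(63/2) exp(-3 Eb / (N0 + k Eb / 256)), with N0 normalized to 1."""
    return 31.5 * math.exp(-3 * eb_over_n0 / (1 + k * eb_over_n0 / 256))


def max_interferers(eb_over_n0, target):
    """Largest k for which the bound stays at or below the target (0 if none)."""
    k = 0
    while pr_error_bound(eb_over_n0, k + 1) <= target:
        k += 1
    return k


# ASSUMPTION: Eb/N0 = 10 (i.e., 10 dB) and a target bound of 0.01, for illustration.
k_max = max_interferers(10.0, 1e-2)
print(k_max, pr_error_bound(10.0, k_max))
# With users silent about half the time, roughly twice as many can be tolerated.
print(2 * k_max)
```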


The other types of cellular systems (GSM and TDMA) attempt to keep the interfering cell 
phones in different frequency bands and time slots. If successful, this is, of course, preferable to 
CDMA, since there is then no interference rather than the limited interference in (9.81). The 
difficulty with these other schemes is that frequency slots and time slots must be reused by 
cell phones going to other cell stations (although preferably not by cell phones connected with 
neighboring cell stations). The need to avoid slot re-use between neighboring cells leads to very 
complex algorithms for allocating re-use patterns between cells, and these algorithms cannot 
make use of the factor of 2 due to users being quiet half the time. 


Because these transmissions are narrow band, when interference occurs, it is not attenuated by 
a factor of 256 as in (9.81). Thus the question boils down to whether it is preferable to have a 
large number of small interferers or a small number of larger interferers. This, of course, is only 
one of the issues that differ between CDMA systems and narrow band systems. For example, 
narrow band systems cannot make use of rake receivers, although they can make use of many 
techniques developed over the years for narrow band transmission. 
9.10 Summary of Wireless Communication 


Wireless communication differs from wired communication primarily in the time-varying nature 
of the channel and the interference from other wireless users. The time-varying nature of the 
channel is the more technologically challenging of the two, and has been the primary focus of 
this chapter. 


Wireless channels frequently have multiple electromagnetic paths of different lengths from trans- 
mitter to receiver and thus the receiver gets multiple copies of the transmitted waveform at 
slightly different delays. If this were the only problem, then the channel could be represented 
as a linear time-invariant (LTI) filter with the addition of noise, and this could be treated as a 
relatively minor extension to the non-filtered channels with noise studied in earlier chapters. 


The problem that makes wireless communication truly different is the fact that the different 
electromagnetic paths are also sometimes moving with respect to each other, thus giving rise to 
different Doppler shifts on different paths. 


Section 9.3 showed that these multiple paths with varying Doppler shifts lead to an input/output 
model which, in the absence of noise, is modeled as a linear time-varying (LTV) filter $h(\tau, t)$, 
which is the response at time $t$ to an impulse $\tau$ seconds earlier. This has a time-varying system 
function $\hat{h}(f, t)$ which, for each fixed $t$, is the Fourier transform of $h(\tau, t)$. These LTV filters 
behave in a somewhat similar fashion to the familiar LTI filters. In particular, the channel input 
$x(t)$ and noise-free output $y(t)$ are related by the convolution equation, $y(t) = \int h(\tau, t)x(t-\tau)\,d\tau$. 
Also, $y(t)$, for each fixed $t$, is the inverse Fourier transform of $\hat{x}(f)\hat{h}(f, t)$. The major difference 
is that $\hat{y}(f)$ is not equal to $\hat{x}(f)\hat{h}(f, t)$ unless $\hat{h}(f, t)$ is non-varying in $t$. 


The major parameters of a wireless channel (at a given carrier frequency $f_c$) are the Doppler 
spread $\mathcal{D}$ and the time spread $\mathcal{L}$. The Doppler spread is the difference between the largest and 
smallest significant Doppler shift on the channel (at $f_c$). It was shown to be twice the bandwidth 
of $|\hat{h}(f_c, t)|$ viewed as a function of $t$. Similarly, $\mathcal{L}$ is the time spread between the longest and 
shortest multipath delay (at a fixed output time $t_0$). It was shown to be twice the 'bandwidth' 
of $|\hat{h}(f, t_0)|$ viewed as a function of $f$. 


The coherence time $\mathcal{T}_{coh}$ and coherence frequency $\mathcal{F}_{coh}$ were defined as $\mathcal{T}_{coh} = \frac{1}{2\mathcal{D}}$ and $\mathcal{F}_{coh} = 
\frac{1}{2\mathcal{L}}$. Qualitatively, these parameters represent the duration of multipath fades in time and the 
duration over frequency respectively. Fades, as their name suggests, occur gradually, both in 
time and frequency, so these parameters represent duration only in an order-of-magnitude sense. 
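
As an order-of-magnitude illustration, the definitions $\mathcal{T}_{coh} = 1/(2\mathcal{D})$ and $\mathcal{F}_{coh} = 1/(2\mathcal{L})$ can be evaluated for sample numbers. The Doppler spread and time spread below are hypothetical values chosen here, not measurements from the text.

```python
def coherence_time(doppler_spread_hz):
    """T_coh = 1 / (2 D), in seconds."""
    return 1 / (2 * doppler_spread_hz)


def coherence_frequency(time_spread_s):
    """F_coh = 1 / (2 L), in Hz."""
    return 1 / (2 * time_spread_s)


# ASSUMPTION: hypothetical channel with D = 100 Hz and L = 2 microseconds.
t_coh = coherence_time(100.0)
f_coh = coherence_frequency(2e-6)
print(t_coh * 1e3, "ms")      # about 5 ms
print(f_coh / 1e3, "kHz")     # about 250 kHz

# A 1.2288 MHz signal is several coherence frequencies wide for this channel,
# which is the situation in which a multitap model and a rake receiver help.
print(1.2288e6 / f_coh)
```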


As shown in Section 9.4, these bandpass models of wireless channels can be converted to baseband 
models and then converted to discrete time models. The relation between the bandpass and 
baseband model is quite similar to that for non-fading channels. The discrete time model relies 
on the sampling theorem, and, while mathematically correct, can somewhat distort the view of 
channels with a small number of paths, sometimes yielding only one tap, and sometimes yielding 
many more taps than paths. Nonetheless this model is so convenient for acquiring insight about 
wireless channels that it is widely used, particularly among those who dislike continuous-time 
models. 


Section 9.5 then breaks the link with electromagnetic models and views the baseband tapped 
delay line model probabilistically. At the same time, WGN is added. A one-tap model
corresponds to situations where the transmission bandwidth is narrow relative to the coherence 
frequency $\mathcal{F}_{coh}$ and multitap models correspond to the opposite case. We generally model the 
individual taps as being Rayleigh faded, corresponding to a large number of small independent 
paths in the corresponding delay range. Several other models, including the Rician model and 
non-coherent deterministic model, were analyzed, but physical channels have such variety that 
these models only provide insight into the types of behavior to expect. The modeling issues are 
quite difficult here, and our point of view has been to analyze the consequences of a few very 
simple models. 


Consistent with the above philosophy, Section 9.6 analyzes a single tap model with Rayleigh 
fading. The classical Rayleigh fading error probability, using binary orthogonal signals and no 
knowledge of the channel amplitude or phase, is calculated to be $1/[2 + E_b/N_0]$. The classical 
error probability for non-coherent detection, where the receiver knows the channel magnitude 
but not the phase, is also calculated and compared with the coherent result as derived for
non-faded channels. For large $E_b/N_0$, the results are very similar, saying that knowledge of the phase 
is not very important in that case. However, the non-coherent detector does not use the channel 
magnitude in detection, showing that detection in Rayleigh fading would not be improved by 
knowledge of the channel magnitude. 

The conclusion from this study is that reasonably reliable communication for wireless channels 
needs diversity or coding or needs feedback with rate or power control. With $L$th order diversity 
in Rayleigh fading, it was shown that error probability tends to 0 as $(E_b/4N_0)^{-L}$ for large 
$E_b/N_0$. If the magnitudes of the various diversity paths are known, then the error probability 
can be made still smaller. 
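
The contrast between these results is easiest to see numerically. The sketch below compares the Rayleigh fading error probability $1/[2 + E_b/N_0]$ with the non-faded non-coherent result $\frac{1}{2}\exp[-E_b/(2N_0)]$, and also prints the rate $(E_b/4N_0)^{-L}$ for a few diversity orders. The last quantity is only the large-$E_b/N_0$ order of decay, not an exact error probability, and the function names are chosen here for illustration.

```python
import math


def rayleigh_error_probability(eb_n0):
    """Binary orthogonal signals in Rayleigh fading: 1 / (2 + Eb/N0)."""
    return 1 / (2 + eb_n0)


def unfaded_noncoherent_error_probability(eb_n0):
    """Non-coherent detection without fading: (1/2) exp(-Eb / (2 N0))."""
    return 0.5 * math.exp(-eb_n0 / 2)


def diversity_decay(eb_n0, order):
    """Order of decay with diversity: (Eb / (4 N0)) ** (-order)."""
    return (eb_n0 / 4) ** (-order)


for db in (10, 20, 30):
    eb_n0 = 10 ** (db / 10)
    print(db,
          rayleigh_error_probability(eb_n0),
          unfaded_noncoherent_error_probability(eb_n0),
          [diversity_decay(eb_n0, order) for order in (1, 2, 4)])
```

Without diversity, each additional 10 dB buys only a factor of about 10 in error probability under Rayleigh fading, whereas the non-faded result falls exponentially.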

Knowledge of the channel as it varies can be helpful in two ways. One is to reduce the error 
probability when coding and/or diversity are used, and the other is to exercise rate control or 
power control at the transmitter. Section 9.7 analyzes various channel measurement techniques, 
including direct measurement by sending known probing sequences and measurement using rake 
receivers. These are both widely used and effective tools. 

Finally, all of the above analysis and insight about wireless channels is brought to bear in Section 
9.9, which describes the IS95 CDMA cellular system. In fact, this section illustrates most of the 
major topics throughout this text. 


9A Appendix: Error probability for non-coherent detection 


Under hypothesis $U=(a,0)$, $|V_0|$ is a Rician random variable $R$ which has the density$^{30}$ 


$$f_R(r) = \frac{r}{WN_0/2}\exp\left\{-\frac{r^2 + a^2g^2}{WN_0}\right\} I_0\left(\frac{rag}{WN_0/2}\right), \quad r \ge 0, \qquad (9.82)$$


where $I_0$ is the modified Bessel function of zeroth order. Conditional on $U=(0,a)$, $|V_1|$ has the 
same density, so the likelihood ratio is 


$$\frac{f[(|v_0|, |v_1|) \mid U=(a,0)]}{f[(|v_0|, |v_1|) \mid U=(0,a)]} = \frac{I_0(2|v_0|ag/WN_0)}{I_0(2|v_1|ag/WN_0)}. \qquad (9.83)$$


$I_0$ is known to be monotonic increasing in its argument, which verifies that the maximum 
likelihood decision rule is to choose $U=(a,0)$ if $|v_0| > |v_1|$ and choose $U=(0,a)$ otherwise. 


By symmetry, the probability of error is the same for either hypothesis, and is given by 

$$\Pr(e) = \Pr\{|V_0|^2 \le |V_1|^2 \mid U=(a,0)\} = \Pr\{|V_0|^2 > |V_1|^2 \mid U=(0,a)\}. \qquad (9.84)$$


$^{30}$See, for example, Proakis, [21], p. 304. 
This can be calculated by straightforward means without any reference to Rician rv's or Bessel 
functions. We calculate the error probability, conditional on hypothesis $U=(a,0)$, and do this 
by returning to rectangular coordinates. Since the results are independent of the phase $\phi_i$ of $G_i$ 
for $i = 0$ or 1, we will simplify our notation by assuming $\phi_0 = \phi_1 = 0$. 


Conditional on $U=(a,0)$, $|V_1|^2$ is just $|Z_1|^2$. Since the real and imaginary parts of $Z_1$ are iid 
Gaussian with variance $WN_0/2$ each, $|Z_1|^2$ is exponential with mean $WN_0$. Thus, for any $x \ge 0$, 


$$\Pr(|V_1|^2 > x \mid U=(a,0)) = \exp\left(-\frac{x}{WN_0}\right). \qquad (9.85)$$


Next, conditional on hypothesis $U=(a,0)$ and $\phi_0 = 0$, we see from (9.57) that $V_0 = ag + Z_0$. 
Letting $V_{0,re}$ and $V_{0,im}$ be the real and imaginary parts of $V_0$, the probability density of $V_{0,re}$ and 
$V_{0,im}$, given hypothesis $U=(a,0)$ and $\phi_0 = 0$ is 


$$f(v_{0,re}, v_{0,im} \mid U=(a,0)) = \frac{1}{2\pi WN_0/2}\exp\left(-\frac{[v_{0,re} - ag]^2 + v_{0,im}^2}{WN_0}\right). \qquad (9.86)$$


We now combine (9.85) and (9.86). All probabilities below are implicitly conditioned on
hypothesis $U=(a,0)$ and $\phi_0 = 0$. For a given observed pair $v_{0,re}, v_{0,im}$, an error will be made if 
$|V_1|^2 > v_{0,re}^2 + v_{0,im}^2$. Thus, 


$$\Pr(e) = \iint f(v_{0,re}, v_{0,im} \mid U=(a,0))\, \Pr(|V_1|^2 > v_{0,re}^2 + v_{0,im}^2)\, dv_{0,re}\, dv_{0,im}$$

$$= \iint \frac{1}{2\pi WN_0/2}\exp\left(-\frac{(v_{0,re} - ag)^2 + v_{0,im}^2}{WN_0}\right)\exp\left(-\frac{v_{0,re}^2 + v_{0,im}^2}{WN_0}\right) dv_{0,re}\, dv_{0,im}.$$


The following equations combine these exponentials, "complete the square" and recognize the 
result as simple Gaussian integrals. 


$$\Pr(e) = \iint \frac{1}{2\pi WN_0/2}\exp\left(-\frac{2v_{0,re}^2 - 2agv_{0,re} + a^2g^2 + 2v_{0,im}^2}{WN_0}\right) dv_{0,re}\, dv_{0,im}$$

$$= \frac{1}{2}\iint \frac{1}{2\pi WN_0/4}\exp\left(-\frac{(v_{0,re} - \frac{1}{2}ag)^2 + v_{0,im}^2 + \frac{1}{4}a^2g^2}{WN_0/2}\right) dv_{0,re}\, dv_{0,im}$$

$$= \frac{1}{2}\exp\left(-\frac{a^2g^2}{2WN_0}\right)\iint \frac{1}{2\pi WN_0/4}\exp\left(-\frac{(v_{0,re} - \frac{1}{2}ag)^2 + v_{0,im}^2}{WN_0/2}\right) dv_{0,re}\, dv_{0,im}.$$


Integrating the Gaussian integrals, 


$$\Pr(e) = \frac{1}{2}\exp\left(-\frac{a^2g^2}{2WN_0}\right). \qquad (9.87)$$
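
The closed form $\frac{1}{2}\exp[-a^2g^2/(2WN_0)]$ can be sanity-checked by simulation. The sketch below draws $V_0 = ag + Z_0$ and $V_1 = Z_1$, with real and imaginary noise parts of variance $WN_0/2$ each, and counts how often $|V_0| < |V_1|$. The parameter values, trial count, and function name are chosen here for illustration.

```python
import math
import random


def simulate_error_rate(a_g, w_n0, trials, seed=0):
    """Monte Carlo estimate of Pr(|V0| < |V1|) under hypothesis U = (a, 0)."""
    rng = random.Random(seed)
    sigma = math.sqrt(w_n0 / 2)   # std dev of each real and imaginary noise part
    errors = 0
    for _ in range(trials):
        v0 = complex(a_g + rng.gauss(0, sigma), rng.gauss(0, sigma))
        v1 = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        if abs(v0) < abs(v1):
            errors += 1
    return errors / trials


a_g, w_n0 = 2.0, 1.0              # hypothetical values of the product ag and of W*N0
estimate = simulate_error_rate(a_g, w_n0, trials=200_000)
exact = 0.5 * math.exp(-a_g**2 / (2 * w_n0))
print(estimate, exact)            # the two should agree to within sampling error
```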
9.E Exercises 


9.1. (a) Eq. (9.6) is derived under the assumption that the motion is in the direction of the line 
of sight from sending antenna to receiving antenna. Find this field under the assumption 
that there is an arbitrary angle $\phi$ between the line of sight and the motion of the receiver. 
Assume that the time range of interest is small enough that changes in $(\theta, \psi)$ can be ignored. 
(b) Explain why, and under what conditions, it is reasonable to ignore the change in $(\theta, \psi)$ 
over small intervals of time. 


9.2. Eq. (9.10) is an approximation to (9.9). Derive an exact expression for the received 
waveform y(t) starting with (9.9). Hint: Express each term in (9.9) as the sum of two 
terms, one the approximation used in (9.10) and the other a correction term. Interpret 
your result. 


9.3. (a) Let $r_1$ be the length of the direct path in Figure 9.4. Let $r_2$ be the length of the reflected 
path (summing the path length from the transmitter to ground plane and the path length 
from ground plane to receiver). Show that as $r$ increases, $r_2 - r_1$ is asymptotically equal to 
$b/r$ for some constant $b$; find the value of $b$. Hint: Recall that for $x$ small, $\sqrt{1+x} \approx (1 + x/2)$ 
in the sense that $[\sqrt{1+x} - 1]/x \to 1/2$ as $x \to 0$. 


(b) Assume that the received waveform at the receiving antenna is given by 


$$E_r(f,t) = \frac{\Re[\alpha \exp\{2\pi i[ft - fr_1/c]\}]}{r_1} - \frac{\Re[\alpha \exp\{2\pi i[ft - fr_2/c]\}]}{r_2} \qquad (a)$$


Approximate the denominator $r_2$ by $r_1$ in (a) and show that $E_r \approx \beta/r^2$ for $r^{-1}$ much 
smaller than $c/f$. Find the value of $\beta$. 

(c) Explain why this asymptotic expression remains valid without first approximating the 
denominator $r_2$ in (a) by $r_1$. 


9.4. Evaluate the channel output $y(t)$ for an arbitrary input $x(t)$ when the channel is modeled 
by the multipath model of (9.14). Hint: The argument and answer are very similar to that 
in (9.20), but you should think through the possible effects of time-varying attenuations 
$\beta_j(t)$. 


9.5. (a) Consider a wireless channel with a single path having a Doppler shift $D_1$. Assume that 
the response to an input $\exp\{2\pi i f t\}$ is $y_f(t) = \exp\{2\pi i t(f + D_1)\}$. Evaluate the Doppler 
spread $\mathcal{D}$ and the midpoint between minimum and maximum Doppler shifts $\Delta$. Evaluate 
$\hat{h}(f, t)$, $|\hat{h}(f, t)|$, $\hat{\psi}(f, t)$ and $|\hat{\psi}(f, t)|$ for $\hat{\psi}$ in (9.24). Find the envelope of the output when 
the input is $\cos(2\pi f t)$. 

(b) Repeat part (a) where $y_f(t) = \exp\{2\pi i t(f + D_1)\} + \exp\{2\pi i t f\}$. 


9.6. (a) Bandpass envelopes: Let $y_f(t) = e^{2\pi i f t}\hat{h}(f, t)$ be the response of a multipath channel 
to $e^{2\pi i f t}$ and assume that $f$ is much larger than any of the channel Doppler shifts. Show 
that the envelope of $\Re[y_f(t)]$ is equal to $|y_f(t)|$. 

(b) Find the power $(\Re[y_f(t)])^2$ and consider the result of lowpass filtering this power
waveform. Interpret this filtered waveform as a short-term time-average of the power and relate 
the square root of this time-average to the envelope of $\Re[y_f(t)]$. 
9.7. Equations (9.34) and (9.35) give the baseband system function and impulse response for 
the simplified multipath model. Rederive those formulas using the slightly more general 
multipath model of (9.14) where each attenuation $\beta_j$ can depend on $t$ but not $f$. 


9.8. It is common to define Doppler spread for passband communication as the Doppler spread 
at the carrier frequency and to ignore the change in Doppler spread over the band. If $f_c$ is 1 
GHz and $W$ is 1 MHz, find the percentage error over the band in making this approximation. 


9.9. This illustrates why the tap gain corresponding to the sum of a large number of potential 
independent paths is not necessarily well approximated by a Gaussian distribution. Assume 
there are N possible paths and each appears independently with probability 2/N. To make 
the situation as simple as possible, suppose that if path n appears, its contribution to a 
given random tap gain, say $G_{0,0}$, is equiprobably $\pm 1$, with independence between paths.
That is,

$$G_{0,0} = \sum_{n=1}^{N} \theta_n \phi_n,$$

where $\phi_1, \phi_2, \ldots, \phi_N$ are iid random variables taking on the value 1 with probability $2/N$
and taking on the value 0 otherwise and $\theta_1, \ldots, \theta_N$ are iid and equiprobably $\pm 1$.


(a) Find the mean and variance of $G_{0,0}$ for any $N > 1$ and take the limit as $N \to \infty$.


(b) Give a common sense explanation of why the limiting rv is not Gaussian. Explain why 
the central limit theorem does not apply here. 


(c) Give a qualitative explanation of what the limiting distribution of $G_{0,0}$ looks like. If
this sort of thing amuses you, it is not hard to find the exact distribution. 
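
One way to explore Exercise 9.9 numerically before working it analytically is a small Monte Carlo experiment. The sketch below is only an illustration: the function names, the number of trials, and the seed are assumptions introduced here, not part of the text. It draws samples of $G_{0,0}$ from the definition above and reports the empirical mean, the empirical variance, and the fraction of samples that are exactly zero.

```python
import random


def sample_tap_gain(n_paths, rng):
    """Draw one sample of G_{0,0} = sum_n theta_n * phi_n."""
    total = 0
    for _ in range(n_paths):
        # phi_n = 1 (path present) with probability 2/N, else 0
        if rng.random() < 2.0 / n_paths:
            # theta_n is +1 or -1 with equal probability
            total += 1 if rng.random() < 0.5 else -1
    return total


def empirical_moments(n_paths, n_trials=20000, seed=0):
    """Return (mean, variance, fraction of samples equal to 0)."""
    rng = random.Random(seed)
    samples = [sample_tap_gain(n_paths, rng) for _ in range(n_trials)]
    mean = sum(samples) / n_trials
    var = sum((s - mean) ** 2 for s in samples) / n_trials
    frac_zero = samples.count(0) / n_trials
    return mean, var, frac_zero


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, empirical_moments(n))
```

Comparing the fraction of exact zeros across several values of $N$ may help with parts (b) and (c), since a Gaussian density assigns zero probability to any single value.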


9.10. Let $\hat{g}(f,t)$ be the baseband equivalent system function for a linear time-varying filter, and
consider baseband inputs $u(t)$ limited to the frequency band $(-W/2, W/2)$. Define the
baseband limited impulse response $g(\tau,t)$ by

$$g(\tau,t) = \int_{-W/2}^{W/2} \hat{g}(f,t) \exp\{2\pi i f \tau\}\, df.$$

a) Show that the output $v(t)$ for input $u(t)$ is

$$v(t) = \int_{\tau} u(t-\tau)\, g(\tau,t)\, d\tau.$$

b) For the discrete-time baseband model of (9.41), find the relationship between $g_{k,m}$ and
$g(k/W, m/W)$. Hint: it is a very simple relationship.

c) Let $G(\tau,t)$ be a random variable whose sample values are $g(\tau,t)$ and define

$$R(\tau,t') = \frac{1}{W}\, \mathsf{E}\{G(\tau,t)\, G^*(\tau,t+t')\}.$$

What is the relationship between $R(\tau,t')$ and $R(k,n)$ in (9.46)?


d) Give an interpretation to $\int R(\tau,0)\,d\tau$ and indicate how it might change with $W$. Can
you explain, from this, why $R(\tau,t)$ is defined using the scaling factor $W$?


9.11. (a) Average over gain in the non-coherent detection result in (9.59) to rederive the Rayleigh
fading error probability.

(b) Assume narrow-band fading with a single tap $G_m$. Assume that the sample value of the
tap magnitude, $|g_m|$, is measured perfectly and fed back to the transmitter. Suppose that
the transmitter, using pulse position modulation, chooses the input magnitude dynamically
so as to maintain a constant received signal to noise ratio. That is, the transmitter sends
$a/|g_m|$ instead of $a$. Find the expected transmitted energy per binary digit.


9.12. Consider a Rayleigh fading channel in which the channel can be described by a single
discrete-time complex filter tap $G_m$. Consider binary communication where, for each pair
of time-samples, one of two equiprobable signal pairs is sent, either $(a,a)$ or $(a,-a)$. The
output at discrete times 0 and 1 is given by

$$V_m = U_m G + Z_m; \qquad m = 0, 1.$$

The magnitude of $G$ has density $f(|g|) = 2|g|\exp\{-|g|^2\}$; $|g| \ge 0$. $G$ is the same for
$m = 0, 1$ and is independent of $Z_0$ and $Z_1$, which in turn are iid circularly symmetric
Gaussian with variance $N_0/2$ per real and imaginary part. Explain your answers in each
part.

(a) Consider the noise transformation

$$Z'_0 = \frac{Z_1 + Z_0}{\sqrt{2}}; \qquad Z'_1 = \frac{Z_1 - Z_0}{\sqrt{2}}.$$

Show that $Z'_0$ and $Z'_1$ are statistically independent and give a probabilistic characterization
of them.

(b) Let

$$V'_0 = \frac{V_1 + V_0}{\sqrt{2}}; \qquad V'_1 = \frac{V_1 - V_0}{\sqrt{2}}.$$

Give a probabilistic characterization of $(V'_0, V'_1)$ under $U=(a,a)$ and under $U=(a,-a)$.

(c) Find the log likelihood ratio $\Lambda(v'_0, v'_1)$ and find the MAP decision rule for using $v'_0, v'_1$
to choose $U=(a,a)$ or $(a,-a)$.

(d) Find the probability of error using this decision rule.

(e) Is the pair $V_0, V_1$ a function of $V'_0, V'_1$? Why is this question relevant?


9.13. Consider the two-tap Rayleigh fading channel of Example 9.8.1. The input $U = U_0, U_1, \ldots$
is one of two possible hypotheses, either $u^0 = (\sqrt{E_b}, 0, 0, 0)$ or $u^1 = (0, 0, \sqrt{E_b}, 0)$, where
$U_\ell = 0$ for $\ell \ge 4$ for both hypotheses. The output is a discrete time complex sequence
$V = V_0, V_1, \ldots$, given by

$$V_m = G_{0,m} U_m + G_{1,m} U_{m-1} + Z_m.$$

For each $m$, $G_{0,m}$ and $G_{1,m}$ are iid and circularly symmetric complex Gaussian rv's with
$G_{k,m} \sim \mathcal{CN}(0, 1/2)$ for $k$ both 0 and 1. The correlation of $G_{0,m}$ and $G_{1,m}$ with $m$ is
immaterial, and can be assumed uncorrelated. Assume that the sequence $Z_m \sim \mathcal{CN}(0, N_0)$
is a sequence of iid circularly symmetric rv's. The signal, the noise, and the channel taps are
all independent. As explained in the example, the energy vector $X = (X_0, X_1, X_2, X_3)^{\mathsf{T}}$,
where $X_m = |V_m|^2$, is a sufficient statistic for the hypotheses $u^0$ and $u^1$. Also, as explained
there, these energy variables are independent and exponential given the hypothesis. More
specifically, define $\alpha = \frac{1}{E_b/2 + N_0}$ and $\beta = \frac{1}{N_0}$. Then, given $U = u^0$, the variables $X_0$ and
$X_1$ each have the density $\alpha e^{-\alpha x}$ and $X_2$ and $X_3$ each have the density $\beta e^{-\beta x}$, all for $x \ge 0$.
Given $U = u^1$, these densities are reversed.

(a) Give the probability density of $X$ conditional on $u^0$.

(b) Show that the log likelihood ratio is given by

$$\mathrm{LLR}(x) = (\beta - \alpha)(x_0 + x_1 - x_2 - x_3).$$

(c) Let $Y_0 = X_0 + X_1$ and let $Y_1 = X_2 + X_3$. Find the probability density and the
distribution function for $Y_0$ and $Y_1$ conditional on $u^0$.

(d) Conditional on $U = u^0$, observe that the probability of error is the probability that $Y_1$
exceeds $Y_0$. Show that this is given by

$$\Pr(e) = \frac{3\alpha^2\beta + \alpha^3}{(\alpha+\beta)^3} = \frac{4 + \frac{3E_b}{2N_0}}{\left(2 + \frac{E_b}{2N_0}\right)^3}.$$

Hint: To derive the second expression, first convert the first expression to a function of
$\beta/\alpha$. Recall that $\int_0^\infty e^{-y}\,dy = \int_0^\infty y e^{-y}\,dy = 1$ and $\int_0^\infty y^2 e^{-y}\,dy = 2$.

(e) Explain why the assumption that $G_{k,i}$ and $G_{k,j}$ are uncorrelated for $i \ne j$ was not
needed.


9.14. ($L$th order diversity) This exercise derives the probability of error for $L$th order diversity
on a Rayleigh fading channel. For the particular model described at the end of Section 9.8,
there are $L$ taps in the tapped delay line model for the channel. Each tap $k$ multiplies the
input by $G_{k,m} \sim \mathcal{CN}(0, 1/L)$, $0 \le k \le L-1$. The binary inputs are $u^0 = (\sqrt{E_b}, 0, \ldots, 0)$
and $u^1 = (0, \ldots, 0, \sqrt{E_b}, 0, \ldots, 0)$, where $u^0$ and $u^1$ contain the signal at times 0 and $L$
respectively.

The complex received signal at time $m$ is $V_m = \sum_{k=0}^{L-1} G_{k,m} U_{m-k} + Z_m$ for $0 \le m \le 2L-1$,
where $Z_m \sim \mathcal{CN}(0, N_0)$ is independent over time and independent of the input and channel
tap gains. As shown in Section 9.8, the set of energies, $X_m = |V_m|^2$, $0 \le m \le 2L-1$, are
conditionally independent, given either $u^0$ or $u^1$, and constitute a sufficient statistic for
detection; the ML detection rule is to choose $u^0$ if $\sum_{m=0}^{L-1} X_m \ge \sum_{m=L}^{2L-1} X_m$ and choose
$u^1$ otherwise. Finally, conditional on $u^0$, $X_0, \ldots, X_{L-1}$ are exponential with mean $N_0 +
E_b/L$. Thus for $0 \le m < L$, $X_m$ has the density $\alpha \exp(-\alpha X_m)$ where $\alpha = \frac{1}{E_b/L + N_0}$.
Similarly, for $L \le m < 2L$, $X_m$ has the density $\beta \exp(-\beta X_m)$ where $\beta = \frac{1}{N_0}$.

(a) The following parts of the exercise demonstrate a simple technique to calculate the
probability of error Pr(e) conditional on either hypothesis. This is the probability that the
sum of $L$ iid exponential rv's of rate $\alpha$ is less than the sum of $L$ iid exponential rv's of rate
$\beta = 1/N_0$. View the first sum, i.e., $\sum_{m=0}^{L-1} X_m$ (given $u^0$), as the time of the $L$th arrival in
a Poisson process of rate $\alpha$ and view the second sum, $\sum_{m=L}^{2L-1} X_m$, as the time of the $L$th
arrival in a Poisson process of rate $\beta$ (see Figure 9.18). Note that the notion of time here
has nothing to do with the actual detection problem and is strictly a mathematical artifice
for viewing the problem in terms of Poisson processes.


Figure 9.18: A Poisson process with interarrival times $\{X_k; 0 \le k < L\}$, and another with
interarrival times $\{X_{L+\ell}; 0 \le \ell < L\}$. The combined process can be shown to be a Poisson
process of rate $\alpha + \beta$.


Show that Pr(e) is the probability that, out of the first $2L - 1$ arrivals in the combined
Poisson process above, at least $L$ of those arrivals are from the first process.


(b) Each arrival in the combined Poisson process is independently drawn from the first
process with probability $p = \frac{\alpha}{\alpha+\beta}$ and from the second process with probability $1-p = \frac{\beta}{\alpha+\beta}$.
Show that

$$\Pr(e) = \sum_{\ell=L}^{2L-1} \binom{2L-1}{\ell} p^{\ell} (1-p)^{2L-1-\ell}.$$

(c) Express this result in terms of $\alpha$ and $\beta$ and then in terms of $\frac{E_b}{L N_0}$.

(d) Use the result above to re-calculate Pr(e) for Rayleigh fading without diversity (i.e., 
with $L = 1$). Use it with $L = 2$ to validate the answer in Exercise 9.13.

(e) Show that Pr(e) for very large $E_b/N_0$ decreases with increasing $L$ as $[E_b/(4N_0)]^{-L}$.

(f) Show that Pr(e) for Lth order diversity (using ML detection as above) is exactly the 
same as the probability of error that would result by using (2L — 1) order diversity, making 
a hard decision on the basis of each diversity output, and then using majority rule to make 
a final decision. 
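
The binomial sum in part (b) is easy to evaluate numerically, which gives a quick sanity check on the algebra in parts (c) and (d). The sketch below is an illustration under stated assumptions: the function names are hypothetical, $N_0$ is normalized to 1 so that only the ratio $E_b/N_0$ matters, and the closed form used for the cross-check is the second expression of Exercise 9.13(d).

```python
from math import comb


def diversity_error_probability(L, eb_over_n0):
    """Pr(e) for L-th order diversity, using the binomial sum of part (b)."""
    # With N0 normalized to 1: alpha = 1 / (Eb/L + N0), beta = 1 / N0
    alpha = 1.0 / (eb_over_n0 / L + 1.0)
    beta = 1.0
    p = alpha / (alpha + beta)  # probability an arrival is from the first process
    n = 2 * L - 1               # number of combined arrivals considered
    return sum(comb(n, l) * p**l * (1 - p) ** (n - l) for l in range(L, n + 1))


def two_tap_closed_form(eb_over_n0):
    """Second expression in Exercise 9.13(d), for comparison with L = 2."""
    x = eb_over_n0 / 2.0
    return (4.0 + 3.0 * x) / (2.0 + x) ** 3


if __name__ == "__main__":
    for snr in (1.0, 10.0, 100.0):
        print(snr,
              diversity_error_probability(1, snr),
              diversity_error_probability(2, snr),
              two_tap_closed_form(snr))
```

Printing the values for a range of $L$ at a fixed large $E_b/N_0$ is one way to see the trend that part (e) asks about.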


9.15. Consider a wireless channel with two paths, both of equal strength, operating at a carrier
frequency $f_c$. Assume that the baseband equivalent system function is given by

$$\hat{g}(f,t) = 1 + \exp\{i\phi\} \exp[-2\pi i (f + f_c)\,\tau_2(t)]. \qquad (9.88)$$

(a) Assume that the length of path 1 is a fixed value $r_0$ and the length of path 2 is
$r_0 + \Delta r + vt$. Show (using (9.88)) that

$$\hat{g}(f,t) \approx 1 + \exp\{i\psi\} \exp\left[-2\pi i \left(\frac{f \Delta r}{c} + \frac{f_c v t}{c}\right)\right]. \qquad (9.89)$$

Explain what the parameter $\psi$ is in (9.89); also explain the nature of the approximation
concerning the relative values of $f$ and $f_c$.

(b) Discuss why it is reasonable to define the multipath spread $\mathcal{L}$ here as $\Delta r/c$ and to
define the Doppler spread $\mathcal{D}$ as $f_c v/c$.

(c) Assume that $\psi = 0$, i.e., that $\hat{g}(0,0) = 2$. Find the smallest $t > 0$ such that $\hat{g}(0,t) = 0$.
It is reasonable to denote this value of $t$ as the coherence time $\mathcal{T}_{\mathrm{coh}}$ of the channel.

(d) Find the smallest $f > 0$ such that $\hat{g}(f,0) = 0$. It is reasonable to denote this value of
$f$ as the coherence frequency $\mathcal{F}_{\mathrm{coh}}$ of the channel.


9.16. Union bound: Let $E_1, E_2, \ldots, E_k$ be independent events each with probability $p$.

(a) Show that $\Pr\left(\bigcup_{j=1}^{k} E_j\right) = 1 - (1-p)^k$.

(b) Show that $pk - (pk)^2/2 \le \Pr\left(\bigcup_{j=1}^{k} E_j\right) \le pk$. Hint: One approach is to demonstrate
equality at $p = 0$ and then demonstrate the inequality for the derivative of each term with
respect to $p$. For the first inequality, demonstrating the inequality for the derivative can be
done by looking at the second derivative.
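
A numerical spot check of the two bounds in Exercise 9.16 can build confidence before attempting the proof. The following is a minimal sketch; the function names and the particular $(p, k)$ pairs are illustrative assumptions.

```python
def union_probability(p, k):
    """Pr of the union of k independent events, each of probability p."""
    return 1.0 - (1.0 - p) ** k


def union_bounds(p, k):
    """Lower bound pk - (pk)^2 / 2 and upper bound pk from part (b)."""
    pk = p * k
    return pk - pk**2 / 2.0, pk


if __name__ == "__main__":
    for p, k in [(0.001, 10), (0.01, 10), (0.05, 20), (0.5, 4)]:
        lower, upper = union_bounds(p, k)
        print(p, k, lower, union_probability(p, k), upper)
```

Both bounds are close to the exact value when $pk$ is small and become loose as $pk$ approaches or exceeds 1.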


9.17. (a) Let $u$ be an ideal PN sequence, satisfying $\sum_{\ell} u_\ell u^*_{\ell+k} = 2a^2 n\, \delta_k$. Let $b = u * g$ for some
channel tap gain $g$. Show that $\|b\|^2 = \|u\|^2 \|g\|^2$. Hint: One approach is to convolve $b$
with its matched filter $b^\dagger$. Use the commutativity of convolution along with the properties
of $u * u^\dagger$. Write $b$ as $g * u$ and look at the result of passing $b$ through a filter matched to itself.

(b) If $u^0$ and $u^1$ are each ideal PN sequences as in part (a), show that $b_0 = u^0 * g$ and
$b_1 = u^1 * g$ satisfy $\|b_0\|^2 = \|b_1\|^2$.


9.18. This exercise explores the difference between a rake receiver that estimates the analog
baseband channel and one that estimates a discrete-time model of the baseband channel.
Assume that the channel is estimated perfectly in each case, and look at the resulting
probability of detecting the signal incorrectly.

We do this, somewhat unrealistically, with a 2-PAM modulator sending $\mathrm{sinc}(t)$ given $H=0$
and $-\mathrm{sinc}(t)$ given $H=1$. We assume a channel with two paths having an impulse response
$\delta(t) - \delta(t-\epsilon)$ where $0 < \epsilon \ll 1$. The received waveform, after demodulation from passband
to baseband, is

$$V(t) = \pm[\mathrm{sinc}(t) - \mathrm{sinc}(t - \epsilon)] + Z(t),$$

where $Z(t)$ is WGN of spectral density $N_0/2$. We have assumed for simplicity that the
phase angles due to the demodulating carrier are 0.

(a) Describe the ML detector for the analog case where the channel is perfectly known at 
the receiver. 

(b) Find the probability of error Pr(e) in terms of the energy of the low pass received signal,
$E = \|\mathrm{sinc}(t) - \mathrm{sinc}(t-\epsilon)\|^2$.

(c) Approximate $E$ by using the approximation $\mathrm{sinc}(t-\epsilon) \approx \mathrm{sinc}(t) - \epsilon\, \mathrm{sinc}'(t)$. Hint: recall
the Fourier transform pair $u'(t) \leftrightarrow 2\pi i f\, \hat{u}(f)$.

(d) Next consider the discrete-time model where, since the multipath spread is very small
relative to the signaling interval, the discrete channel is modeled with a single tap $g$. The
sampled output at epoch 0 is $\pm g[1 - \mathrm{sinc}(-\epsilon)] + Z(0)$. We assume that $Z(t)$ has been
filtered to the baseband bandwidth $W = 1/2$. Find the probability of error using this
sampled output as the observation and assuming that $g$ is known.

(e) The probability of error for both the result in (d) and the result in (b) and (c) approaches
1/2 as $\epsilon \to 0$. Contrast the way in which each result approaches 1/2.


(f) Try to explain why the discrete approach is so inferior to the analog approach here.
Hint: What is the effect of using a single tap approximation to the sampled low pass channel
model?
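
For parts (b) and (c), a short numerical comparison can confirm that the first-order approximation of the energy is reasonable for small $\epsilon$. The sketch below rests on assumptions that should be checked against your own derivation: it uses $\mathrm{sinc}(t) = \sin(\pi t)/(\pi t)$, the identity $\langle \mathrm{sinc}(t), \mathrm{sinc}(t-\epsilon)\rangle = \mathrm{sinc}(\epsilon)$ for the exact energy, and the standard antipodal-signaling result $\Pr(e) = Q(\sqrt{2E/N_0})$ for WGN of spectral density $N_0/2$. All function names are hypothetical.

```python
import math


def sinc(t):
    """sinc(t) = sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)


def exact_energy(eps):
    # ||sinc(t) - sinc(t - eps)||^2 = 2 - 2 sinc(eps), assuming unit-energy sinc
    # pulses whose inner product at offset eps is sinc(eps)
    return 2.0 * (1.0 - sinc(eps))


def approx_energy(eps):
    # eps^2 * ||sinc'||^2 = eps^2 * integral over [-1/2, 1/2] of (2 pi f)^2 df
    return (math.pi * eps) ** 2 / 3.0


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def analog_error_probability(eps, n0):
    """Assumed antipodal result Pr(e) = Q(sqrt(2E/N0)) with the exact energy."""
    return q_function(math.sqrt(2.0 * exact_energy(eps) / n0))


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, exact_energy(eps), approx_energy(eps),
              analog_error_probability(eps, 1.0))
```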
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